---  
title: |
    \vspace{-15mm} \setstretch{0.5} \textsc{\Large{}The Economic Impacts of COVID-19:} \
    \textsc{\large{}Evidence from a New Public Database Built Using Private Sector Data}
author: |
    Raj Chetty, John N. Friedman, Michael Stepner,\
    and the Opportunity Insights Team\thanks{The Opportunity Insights Economic Tracker Team as of July 2023 has
    consisted of Hamidah Alatas, Camille Baker, Harvey Barnhard, Matt
    Bell, Gregory Bruich, Tina Chelidze, Lucas Chu, Westley Cineus, Sebi
    Devlin-Foltz, Michael Droste, Dhruv Gaur, Federico Gonzalez, Rayshauna
    Gray, Abigail Hiller, Matthew Jacob, Tyler Jacobson, Margaret Kallus,
    Fiona Kastel, Laura Kincaide, Caitlin Kupsc, Sarah LaBauve, Lucía
    Lamas, Maddie Marino, Kai Matheson, Jared Miller, Christian Mott,
    Kate Musen, Danny Onorato, Sarah Oppenheimer, Trina Ott, Lynn Overmann,
    Max Pienkny, Jeremiah Prince, Sebastian Puerta, Daniel Reuter, Peter
    Ruhm, Tom Rutter, Emanuel Schertz, Shannon Felton Spence, Krista Stapleford,
    Kamelia Stavreva, Ceci Steyn, James Stratton, Clare Suter, Elizabeth
    Thach, Nicolaj Thor, Amanda Wahlers, Kristen Watkins, Alanna Williams,
    David Williams, Chase Williamson, Shady Yassin, Ruby Zhang, and Austin
    Zheng.}
date: |
    First Version: May 2020 \
    This Version: July 2023
abstract: |
    \noindent
    We build a publicly available database that tracks economic
    activity in the U.S. at a granular level in real time using anonymized
    data from private companies. We report weekly statistics on consumer
    spending, business revenues, job postings, and employment rates disaggregated
    by county, sector, and income group. Using the publicly available
    data, we show how the COVID-19 pandemic affected the economy by analyzing
    heterogeneity in its impacts across subgroups. High-income individuals
    reduced spending sharply in March 2020, particularly in sectors that
    require in-person interaction. This reduction in spending greatly
    reduced the revenues of small businesses in affluent, dense areas.
    Those businesses laid off many of their employees, leading to widespread
    job losses, especially among low-wage workers in such areas. High-wage
    workers experienced a “V-shaped” recession that lasted a few weeks,
    whereas low-wage workers experienced much larger, more persistent
    job losses. Even though consumer spending and job postings had recovered fully by December 2021, 
    employment rates in low-wage jobs remained depressed in areas that were initially hard hit, 
    indicating that the temporary fall in labor demand led to a persistent reduction in labor supply. 
    Building on this diagnostic analysis, we evaluate the impacts
    of fiscal stimulus policies designed to stem the downward spiral in
    economic activity. Cash stimulus payments led to sharp increases in
    spending early in the pandemic, but much smaller responses later in
    the pandemic, especially for high-income households. Real-time estimates
    of marginal propensities to consume provided better forecasts of the
    impacts of subsequent rounds of stimulus payments than historical
    estimates. Overall, our findings suggest that fiscal policies can
    stem secondary declines in consumer spending and job losses, but cannot
    restore full employment when the initial shock to consumer spending
    arises from health concerns. More broadly, our analysis demonstrates
    how public statistics constructed from private sector data can support
    many research and real-time policy analyses, providing a new tool
    for empirical macroeconomics. JEL: E01, E32.
thanks: |
    \noindent
    We thank the corporate partners who provided the underlying
    data used to construct the public database built in this paper: Affinity
    Solutions (especially Atul Chadha and Arun Rajagopal), Lightcast (Anton
    Libsch and Bledi Taska), CoinOut (Jeff Witten), Earnin (Arun Natesan
    and Ram Palaniappan), Homebase (Ray Sandza and Andrew Vogeley), Intuit
    (Christina Foo and Krithika Swaminathan), Kronos (David Gilbertson),
    Paychex (Mike Nichols and Shadi Sifain), Womply (Derek Doel and Ryan
    Thorpe), and Zearn (Billy McRae and Shalinee Sharma). We are very
    grateful to Nathaniel Hendren, who collaborated with us to launch
    the initial version of the database and helped conduct preliminary
    analyses for the first draft of this paper in Spring 2020. We are
    also grateful to Ryan Rippel of the Gates Foundation for his support
    in launching this project and to Gregory Bruich for early conversations
    that helped spark this work. We thank David Autor, Gabriel Chodorow-Reich,
    Haley O’Donnell, Emmanuel Farhi, Jason Furman, Steven Hamilton, Erik
    Hurst, Xavier Jaravel, Lawrence Katz, Fabian Lange, Emmanuel Saez,
    Ludwig Straub, Danny Yagan, and numerous seminar participants for
    helpful comments. The work was funded by the Chan-Zuckerberg Initiative,
    Bill & Melinda Gates Foundation, Overdeck Family Foundation, and
    Andrew and Melora Balson. The project was approved under Harvard University
    IRB 20-0586.
nocite: |
    @chetty2014land, @coibion2020labor
bibliography: |
    bibliography.bib
---

\clearpage
\newgeometry{left = 1in, right = 1in, bottom = 1in, top = 1in}
\pagenumbering{arabic}
\setcounter{page}{1}
\setstretch{1.5}

<!----------------------------------------------------------------------------->
<!-- Main Text ---------------------------------------------------------------->
<!----------------------------------------------------------------------------->

# Introduction {#sec:Introduction}

Since @kuznets1941, macroeconomic policy decisions have
been made on the basis of publicly available statistics constructed
from recurring surveys of households and businesses conducted by the
federal government. Although such statistics have great value for
understanding total economic activity, they have two limitations.
First, survey-based data typically cannot be used to assess variation
across geographies or subgroups; due to relatively small sample sizes,
most statistics are typically reported only at the national or state
level and breakdowns for demographic subgroups or sectors are unavailable.
Second, such statistics are typically available only at low frequencies,
and often with a significant time lag.[^cex]

[^cex]: For example, data on consumer spending disaggregated by geography
are only available for selected large metro areas at a bi-annual level
in the Consumer Expenditure Survey (CEX).

In this paper, we address these challenges by (1) building a public
[database](https://tracktherecovery.org) that measures spending, 
employment, and other outcomes at a high-frequency, granular level 
using anonymized transaction data collected by companies in the private 
sector and (2) demonstrating how this new database can be used to 
obtain insights into the effects of the coronavirus pandemic (COVID-19) 
and policy responses in near real-time -- within three weeks of the 
shock or policy change of interest.

We organize the paper in three parts. First, we construct statistics
on consumer spending, business revenues, employment rates, job postings,
and other key indicators -- disaggregated by geographic area (county
or ZIP code), industry, and income level -- by combining data from
credit card processors, payroll firms, and financial services firms.
The main challenge in using transactional data collected by private
companies (which we refer to as “private sector data” in what
follows) to measure economic activity is a tension between research
value and privacy protection. For research, it is beneficial to use
raw, disaggregated data -- ideally down to the individual consumer
or business level -- to maximize precision and flexibility of research
designs. But from a privacy perspective, it is preferable to aggregate
and mask data to reduce the risk of disclosure of private information.
To balance these conflicting interests, one must construct statistics
that are sufficiently aggregated and masked to mitigate privacy concerns
yet sufficiently granular to support research. Our goal is to demonstrate
how one can produce public statistics that deliver insights analogous
to those obtained from the underlying confidential microdata, thereby
improving the transparency, timeliness, and reproducibility of empirical
macroeconomic research [@transparency2014].

We construct publicly available series suitable for research from
raw transactional data in a series of steps. We first develop algorithms
to clean the raw data by removing data artifacts and smoothing seasonal
fluctuations. Raw transactional data can exhibit sharp fluctuations
and noise driven by changes in clientele, platform design, or exogenous
events such as holidays [@leamer2011; @mcelroy2018].
We systematically examine each series for such artifacts and develop
methods to address them. Next, we take steps to limit privacy loss
by reporting only changes since January 2020 (rather than raw levels),
masking small cells, and pooling data from multiple companies to comply
with regulations governing the disclosure of material non-public information.
After establishing these protocols, we report the final statistics
using an automated pipeline that ingests data from businesses and
publishes processed data, typically within a week after the relevant
transactions occur.

The new data series we construct are a complement to, rather than a
replacement for, existing public statistics obtained from representative
surveys. The benefits of our data are their granularity and frequency
-- providing daily or weekly data for sectors and subgroups down
to the county level. The drawback is that there are no ex-ante guarantees
that the data provide a representative picture of economic activity
because any one company's clients are not necessarily a representative
sample of U.S. households or firms. We discuss these tradeoffs in
greater detail in Section \ref{subsec:Data-Limitations}. To address
these challenges, we benchmark each series to publicly available statistics
from representative surveys and create series that track the survey-based
measures closely, making ongoing adjustments to series that diverge
from national statistics (e.g., because of changes in a data provider's
clients). Ultimately, the statistics we construct from transactional
data provide an additional set of (imperfect) signals on economic
activity that can in principle yield better statistical inferences
when combined with existing survey-based statistics.[^survey_limitations] 
Whether these data yield valuable new insights in practice is an
empirical question.

[^survey_limitations]: Survey-based statistics themselves do not necessarily provide “ground
truth” because of sampling error, recall error, and growing non-response
bias [@meyer2021; @dutz2021]. Thus, even for longer-term
inferences at the national level, combining information from transactional
data with information from surveys can be valuable.

In the second part of the paper, we evaluate the empirical value of
the new data by using them to analyze the economic impacts of COVID-19,
focusing on the period from March 2020 to December 2021 -- covering
both the decline in economic activity and the recovery to baseline
spending levels. To evaluate how far one can get solely with public
statistics rather than confidential data, we deliberately conduct
our empirical analysis using the aggregate statistics we release publicly.[^replication]

[^replication]: We provide a [replication kit](https://github.com/OpportunityInsights/EconomicTracker)
that generates all of the results in the paper from publicly available
data. We use non-public data for certain robustness checks and validation
analyses reported in the Appendix (as documented in the replication
kit).

National accounts reveal that GDP fell in the second quarter of 2020
following the COVID-19 shock primarily because of a reduction in consumer
spending. We find that spending fell primarily because *high-income*
households started spending much less, using the median household
income in the ZIP code where the cardholder lives as a proxy for household
income.[^validate_zip_income] As of April 2020, {{share_decline_early_q4}}% of the reduction
in total spending since January 2020 came from households who lived
in ZIP codes with median income in the top quartile, while {{share_decline_early_q1}}%
came from households who lived in ZIP codes with median income in
the bottom quartile. This is both because the rich account for a larger
share of spending to begin with and because they cut spending more
in percentage terms. Spending reductions were concentrated in services
that require in-person physical interaction, such as hotels and restaurants,
consistent with contemporaneous work by @SecondMeasureAlexanderKarger
and @farrellJPMorgan2020. These findings suggest that high-income
households reduced spending primarily because of health concerns rather
than a reduction in income or wealth.

[^validate_zip_income]: We verify the quality of our publicly available ZIP-code level proxies
for income by showing that our estimates of spending by income group
during the pandemic are closely aligned with those of @farrellJPMorgan2020,
who observe household income directly for JPMorgan Chase clients in
confidential microdata.

Next, we leverage geographic variation in the demand shocks businesses
face to identify the impacts of the consumer spending shock on businesses.
In-person services are typically produced by small businesses (such
as restaurants) that serve customers in their local area. The revenues
of those small businesses in high-income, dense areas (high-rent ZIP
codes) fell by {{revenue_apr_rent_p100}}% between January
and mid-April 2020, compared with {{revenue_apr_rent_p5}}% in the
lowest-rent ZIP codes.

As businesses lost revenue, they passed the shock on to their employees,
particularly low-wage workers. Postings for jobs with low skill requirements
fell sharply in April 2020, with a much larger reduction in high-rent
areas than low-rent areas. Postings for jobs with high skill requirements
fell much less, and exhibit no cross-sectional gradient with respect
to rent. As a result of the labor demand shock, employment rates fell
by {{emp_apr_q1}}% for workers with wage rates in the bottom
quartile of the pre-COVID wage distribution as of April 15, 2020 (the
trough of the COVID recession), consistent with results first established
using other confidential payroll data sources by @ADPHurstetal.
For those in the top wage quartile, employment rates fell by {{emp_apr_q4}}%.
Low-wage individuals working at small businesses in affluent areas
were especially likely to lose their jobs. At small businesses located
in the highest-rent ZIP codes, {{emp_change_apr_rent_p100}}%
of workers were laid off within two weeks after the COVID crisis began;
in the lowest-rent ZIP codes, {{emp_change_apr_rent_p5}}%
lost their jobs.

Employment levels for workers in the top wage quartile rebounded quickly,
returning to pre-COVID levels by the end of June 2020. In contrast,
employment recovered much more slowly for low-wage workers. The total
number of jobs in the bottom quartile of the pre-pandemic wage distribution
remained {{notexpl_pp}}% below baseline even as of December
2021 (adjusting for wage growth). Why did employment rates for low-wage
workers remain persistently lower? Unlike at the start of the pandemic,
the source of lower employment rates at the end of 2021 was not a
lack of labor demand: total consumer spending and low-skilled job
postings were well above pre-COVID baseline levels throughout 2021.
Furthermore, job postings for low-skilled workers were just as high
in high-rent areas as they were in low-rent areas by December 2021.
However, employment rates for low-wage workers continued to exhibit
a sharp gradient with respect to rent, with employment levels (adjusted
for wage growth) returning to pre-COVID baseline rates in the lowest-rent
areas but remaining {{emp_dec_rent_p100}}pp below pre-COVID
levels in the highest-rent areas. Employment rates in December 2021
were much more strongly related to the size of the initial shock to
economic activity -- e.g., the change in employment rates as of April
2020 -- than contemporaneous factors such as COVID case rates or
unemployment benefit levels. In short, the initial labor demand shock
induced by the reduction in aggregate demand in March 2020 led to
a persistent reduction in labor supply among low-wage workers in the
hardest-hit areas. As a result, business cycle dynamics during the
COVID crisis were not symmetric: on the way down, spending and employment
fell in lockstep, but on the way back, they did not rise together,
echoing patterns documented in the Great Recession [@yagan2019employment].

In the third part of the paper, we examine the scope for stabilization
policies to break the chain of events documented above. We focus on
the impacts of stimulus payments, whose goal was to mitigate reductions
in economic activity by boosting aggregate spending. The federal government
sent households stimulus checks at three points during the crisis:
April 15, 2020, January 4, 2021, and March 17, 2021. Using an event
study design, we find that the stimulus payments made in April 2020
increased spending uniformly across the household income distribution
(again proxying for income based on ZIP code), with both low- and
high-income households increasing spending substantially in the days
after they received checks, consistent with evidence from @BakerStimulus
and @farrellJPMorgan2020 using individual-level administrative
data.

In contrast, the January 2021 payments had highly heterogeneous impacts
across the income distribution: low-income households continued to
spend a substantial fraction of their stimulus checks, but high-income
households (those living in the top quartile of ZIP codes by median
income) spent virtually none of the money they received. The impacts
of the stimulus changed sharply over the course of the recession because
of the heterogeneous spending dynamics documented above: high-income
households cut spending sharply but did not lose much income, and
as a result had built up considerable savings by January 2021, sharply
reducing their marginal propensity to consume. Because our spending
data are available with a short lag, we were able to establish this
result three weeks after the second stimulus payment. These results
were cited in policy debates regarding who should receive the March
2021 stimulus payments, which ultimately concluded with policymakers
lowering the income threshold for eligibility relative to initial
proposals.

Finally, as predicted based on the impacts of the January 2021 stimulus, we
find that the March 2021 stimulus payments increased spending for
low-income households, but had little impact on spending for high-income
households who remained eligible. Hence, estimates of marginal propensities
to consume in January 2021 provided much better forecasts of the impacts
of the March 2021 stimulus payments than historical estimates from
prior recessions, which suggested there would be little heterogeneity
in MPCs by income level [@sahm2012; @broda2014],
or even estimates from just months earlier in the same recession (April
2020). This example demonstrates how public statistics constructed
from private sector data can support a “real time” approach to
macroeconomic policy, where policies are adjusted based on current
evidence on their impacts rather than relying solely on historical
predictions from other economic environments.

The data can analogously be used to analyze the impacts of other policies
beyond stimulus payments that were implemented during the COVID-19
crisis. We find that results from our publicly available data match
those from studies that use confidential data sources closely. For
instance, we find that state-ordered shutdowns and reopenings of economies
had modest impacts on economic activity [as in @goolsbee2021fear]
and that loans to small businesses as part of the Paycheck Protection
Program (PPP) had small impacts on employment rates 
[as in @autor2020evaluation; @PaycheckProtectionGranjaZwick; and @hubbardppp]. 
Additionally, several studies use the new data constructed here to evaluate
many other policies, from unemployment benefit changes
[as in @casado2020effect] to eviction moratoria [as in @an2021more].

We conclude by analyzing whether the combination of government policies
was adequate to stem the decline in economic activity set off by the
reduction in consumer spending documented above. Consumer spending
fell sharply in April 2020 in the dense, affluent areas where many
low-wage workers lost their jobs, portending the start of a downward
spiral of secondary effects stemming from the initial aggregate demand
shock. However, the relationship between consumer spending and the rate of 
local job loss flattened sharply by July 2020. Spending recovered to or above 
baseline levels, even in places where many low-wage workers lost their 
jobs -- presumably because of the substantial infusion of income to such areas in the 
form of fiscal stimulus, unemployment benefits, and other programs that led to an 
increase in disposable income at the bottom of the distribution [@blanchet2022real].

Overall, our findings suggest that fiscal policies can be very valuable
for limiting secondary declines in consumer spending arising from
a loss of income as workers lose their jobs. However, fiscal policy
itself does not have the capacity to restore full employment when
the initial shock to consumer spending arises from health concerns
[@guerrieri_macro_covid]. Furthermore, even after health
concerns have abated, changes in labor supply among those who lost
their jobs may lead to persistent reductions in employment. It may
therefore be useful to target re-employment policies to individuals
who held low-wage jobs in places that suffered the largest job losses
[@austin2018jobs]. Our data provide a way to monitor the areas
and sectors in which job losses persist, information that can be used
to target and evaluate such programs going forward.

Beyond showing how the COVID-19 pandemic affected economic activity,
the broader contribution of this study is the construction of a new
public database of granular macroeconomic statistics that opens new
avenues for empirical macroeconomics, from finer analysis of heterogeneous
impacts across subgroups and areas to real-time policy fine-tuning.
Importantly, such analyses can be conducted by many researchers and
policy analysts, not just those who can secure access to confidential
data and devote resources to cleaning and harmonizing it. In this
sense, the data assembled here provide a prototype for a new system
of real-time, granular national accounts that can be refined in future
work, much as @kuznets1941 and Summers and Heston [-@summers1984improved; -@summers1991penn] 
developed prototypes for national accounts that were refined in subsequent work
[e.g., @feenstra2015next]. Going forward, our intention is
to continue to maintain and refine this database in collaboration
with researchers at government statistical agencies, with the ultimate
aim of creating a complement to survey-based statistics that yields
further detail on economic activity.

Our work builds on two literatures: a longstanding literature on macroeconomic
measurement and a recent literature on the economics of pandemics.
In the macroeconomic measurement literature, our work is most closely
related to studies showing that private sector data sources can be
used to forecast government statistics; see @abraham2019big
for an overview of this work. In the COVID-19 pandemic literature,
numerous papers have used confidential private sector data to analyze
consumer spending; see @vavra2021 and @brodeur2021
for surveys. The contribution of the present study is to present a
comprehensive characterization of how COVID-19 and subsequent stabilization
policies affected economic activity by disaggregating data across
geographic areas and subgroups at a high frequency. We discuss specific
connections with prior work in the context of presenting our results.

The rest of this paper is organized as follows. The next section describes
how we construct the data series we make public. In Section \ref{sec:Impacts},
we analyze the effects of COVID-19 on spending, revenue, and employment.
Section \ref{sec:Policy-evals} analyzes the impacts of stimulus and
other government policies enacted to mitigate COVID's impacts. Section
\ref{sec:Conclusion} concludes. Technical details are available in
an online appendix, and the data used to produce the results can be
[downloaded online](https://github.com/OpportunityInsights/EconomicTracker).

# Construction of the Public Database {#sec:Data}

We use anonymized data from several private companies to construct
public indices of consumer spending, small business revenue, job postings,
and employment rates. All of the data series described below can be
freely downloaded from [the Economic Tracker website](http://www.tracktherecovery.org).

We release each data series at the highest available frequency using
an automated pipeline that ingests data from data providers, constructs
the relevant statistics, conducts quality control tests, and outputs
the series publicly. Appendix \ref{sec:Appx-Pipeline} details the
engineering of this pipeline.

## Methods 

We disaggregate each series by industrial sector, county, and income
quartile wherever feasible. To systematize our approach and facilitate
comparisons between series, we adopt the following four principles
when constructing each series.

First, we remove artifacts in raw data that arise from changes in
data providers' coverage or systems. For instance, firms' clients
often change discretely, sometimes leading to discontinuous jumps
in series, particularly in small cells. We systematically search for
large jumps in series, study their root causes by consulting with
the data provider, and address such discontinuities by imposing continuity
using series-specific methods described below.

Second, we smooth low- and high-frequency fluctuations in the data.
We address high-frequency fluctuations through aggregation, e.g., by
reporting 7-day moving averages to smooth fluctuations across days
of the week. Certain series -- most notably consumer spending and
business revenue -- exhibit strong lower-frequency seasonal fluctuations
that are autocorrelated across years (e.g., a surge in spending around
the holiday season). We de-seasonalize such series by indexing each
week's value in 2020 relative to corresponding values for the same
week in 2019.
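
The two smoothing steps can be sketched as follows. This is an illustrative simplification rather than the code used in our pipeline: the function names, the choice of a trailing (rather than centered) window, and the toy values are all assumptions made for the example.

```python
def moving_average_7d(daily):
    """Trailing 7-day mean. The first six days lack a full window and return None.

    Assumption: a trailing window; a centered window would work analogously.
    """
    out = []
    for i in range(len(daily)):
        if i < 6:
            out.append(None)
        else:
            window = daily[i - 6 : i + 1]
            out.append(sum(window) / 7)
    return out


def deseasonalize(weekly_2020, weekly_2019):
    """Ratio of each 2020 week to the same calendar week of 2019."""
    return [cur / prev for cur, prev in zip(weekly_2020, weekly_2019)]


# Toy daily series with a day-of-week pattern: weekend spending is higher.
daily = [100, 100, 100, 100, 100, 150, 150] * 2
smoothed = moving_average_7d(daily)

# Toy weekly values with a holiday surge in the second week of both years.
ratio = deseasonalize([90.0, 180.0], [100.0, 200.0])
```

In this toy example, every full 7-day window spans one complete weekly cycle, so the smoothed series is flat even though the daily series is not. Likewise, the holiday surge that appears in both years cancels in the year-over-year ratio.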

Third, we take a series of steps to protect the confidentiality of
businesses and their clients. Instead of reporting levels of each
series, we report indexed values that show percentage changes relative
to mean values in January 2020.[^indexing] We suppress small cells and exclude outliers to meet privacy and
data protection requirements, with thresholds that vary across datasets
as described below. For data obtained from publicly traded firms --
whose ability to disclose data is restricted by Securities and Exchange
Commission regulations governing the disclosure of material non-public
information -- we combine data from multiple firms so that the statistics
we report do not reveal information about any single company's activities.

[^indexing]: We always index to January 2020 after summing to a given cell (e.g.,
geographic unit, industry, etc.) rather than at the firm or individual
level. This dollar-weighted approach overweights bigger firms and
higher-income individuals, but leads to smoother series and is more
relevant for certain macroeconomic policy questions (e.g., changes
in aggregate spending).
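
To illustrate the distinction between indexing after summing and indexing at the firm level, consider the following sketch. The two firms, their values, the helper names, and the suppression threshold are hypothetical and chosen only for exposition; actual thresholds vary across datasets.

```python
from statistics import mean


def index_to_january(cell_totals, january_totals):
    """Change in each cell total relative to the cell's January 2020 mean."""
    base = mean(january_totals)
    return [x / base - 1 for x in cell_totals]


def mask_small_cell(values, n_units, min_units):
    """Suppress an entire cell if it has fewer than min_units contributors.

    Assumption: min_units is a hypothetical threshold, not one from the paper.
    """
    return values if n_units >= min_units else [None] * len(values)


# Two hypothetical firms in one cell: one large, one small.
january = {"big": [1000.0, 1000.0], "small": [10.0, 10.0]}
april = {"big": 800.0, "small": 5.0}

# Sum to the cell first, then index (the dollar-weighted approach).
jan_cell = [b + s for b, s in zip(january["big"], january["small"])]
apr_cell = april["big"] + april["small"]
dollar_weighted = index_to_january([apr_cell], jan_cell)[0]

# Alternative: index each firm separately, then average (equal-weighted).
equal_weighted = mean(
    [april[f] / mean(january[f]) - 1 for f in ("big", "small")]
)

# A cell with only two contributors would be suppressed under a threshold of 10.
masked = mask_small_cell([dollar_weighted], n_units=2, min_units=10)
```

The dollar-weighted change is driven almost entirely by the large firm, whereas the equal-weighted change gives the small firm's larger percentage decline the same weight as the large firm's.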

Finally, we address the challenge that our data sources capture information
about the customers each company serves rather than the general population.
Instead of attempting to adjust for this non-representative sampling,
we characterize the portion of the economy that each series represents
by comparing each sample we use to national benchmarks and label the
sector and population subgroup that each series represents.

We follow these four broad principles to construct every public data
series that we release, while adapting the data processing
methodology to the characteristics of each data source.

## Data Series {#sec:Data-Series}

This section provides an overview of how we produce each data series.
We summarize the data sources and give an overview of our key processing
steps in Appendix Table \ref{tab:data_processing}, and provide summary
statistics on sample sizes for each series in Appendix Table \ref{tab:sample_size}.

### Consumer Spending {#subsec:Data-Spending}

We measure consumer spending using aggregated and anonymized
data on credit and debit card spending collected by [Affinity Solutions Inc](https://www.affinity.solutions),
a company that aggregates consumer credit and debit card spending
information to support a variety of financial service products, such
as loyalty programs for banks. Affinity Solutions captures nearly
10% of debit and credit card spending in the U.S. We obtain raw data
from Affinity Solutions disaggregated by county, quartile of ZIP code median income,
industry, and day starting from January 1, 2019.

We process the raw Affinity data into an analytical series following
the four steps above---removing artifacts and outliers, de-seasonalizing,
indexing, and benchmarking---and describe each step in detail in
Appendix \ref{sec:Appx-Affinity}. As an example of our data processing
methods, we detect and remove discontinuous breaks caused by entry
or exit of card providers from the sample. Because these card providers
have geographically concentrated customer bases, the number of active
cards in a county exhibits a sharp upward or downward spike when the
sample of local card providers changes (Appendix Figure \ref{fig:breaks_affinity}).
We identify these sudden changes by analyzing the number of unique
cards from each county with at least one transaction in each week,
using a Supremum Wald test for a structural break at an unknown break
point. If we identify a structural break in week $t$, we impute spending
changes in weeks $\{t-1,t,t+1\}$ using the mean week-to-week percent
change in spending across counties in the same state, excluding all
counties with a structural break.
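
The logic of break detection and imputation can be sketched as follows. This is a stylized version under stated assumptions, not our production code: it tests only for a one-time shift in the mean of the card count, uses a homoskedastic Wald statistic, trims 15% of observations at each end, and does not compute the nonstandard critical values needed for formal inference. All function names and toy values are hypothetical.

```python
def sup_wald_mean_shift(y, trim=0.15):
    """Return (sup statistic, break index) for a one-time shift in the mean of y.

    For each candidate break k in the trimmed interior, compute the Wald
    statistic for equal means before and after k, then take the maximum.
    The returned index k is the first observation of the post-break regime.
    """
    n = len(y)
    lo = max(2, int(n * trim))
    hi = min(n - 2, int(n * (1 - trim)))
    best_stat, best_k = float("-inf"), None
    for k in range(lo, hi + 1):
        left, right = y[:k], y[k:]
        m1, m2 = sum(left) / len(left), sum(right) / len(right)
        # Residual variance when each regime has its own mean.
        ssr = sum((v - m1) ** 2 for v in left) + sum((v - m2) ** 2 for v in right)
        sigma2 = ssr / (n - 2)
        if sigma2 == 0:
            stat = float("inf") if m1 != m2 else 0.0
        else:
            stat = (m1 - m2) ** 2 / (sigma2 * (1 / len(left) + 1 / len(right)))
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


def impute_break_weeks(pct_change, break_week, state_mean_change):
    """Replace weeks t-1, t, t+1 with the state mean change from unaffected counties."""
    out = list(pct_change)
    for w in (break_week - 1, break_week, break_week + 1):
        if 0 <= w < len(out):
            out[w] = state_mean_change[w]
    return out


# Toy weekly counts of active cards: a card provider enters in week 7.
counts = [100, 102, 99, 101, 100, 98, 101, 150, 149, 152, 151, 150]
stat, k = sup_wald_mean_shift(counts)

# Toy week-to-week spending changes for the county and for unaffected
# counties in the same state.
county_change = [0.01, -0.02, 0.0, 0.01, -0.01, 0.02, 0.01, 0.5, 0.0, 0.01, -0.01, 0.0]
state_change = [0.01, -0.01, 0.0, 0.0, -0.01, 0.01, 0.0, 0.01, 0.0, 0.01, 0.0, 0.0]
imputed = impute_break_weeks(county_change, k, state_change)
```

In the toy series, the statistic is maximized at the week in which the card count jumps, and the spurious spending change in that week and in the adjacent weeks is replaced by the state mean.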

The Affinity series has broad coverage across industries, but over-represents
categories in which credit and debit cards are used for purchases
(see Appendix Figure \ref{fig:ind_shares} discussed in Appendix \ref{sec:Appx-Affinity}).
We therefore view the Affinity series as providing statistics that
are representative of total card spending, but not total consumer
spending.

### Small Business Revenue {#subsec:Data-Revenue}

We obtain data on small business transactions and revenues from [Womply](https://www.womply.com/),
a company that aggregates data from several credit card processors
to provide analytical insights to small businesses and other clients. 
Womply receives data from approximately 500,000 small businesses,
which corresponds to more than 5% of small businesses with 1-499
employees in the U.S. in 2020 [@sbprofile2020]. In contrast
to the Affinity series on consumer spending, which is a cardholder-based
panel covering total spending, Womply is a firm-based panel covering
total revenues of small businesses disaggregated by county, sector,
and week. Another key distinction is that the Womply data
measure the location of the business as opposed to where the cardholder
lives.

We process this small business data following each of the same four
broad steps as with the consumer spending data from Affinity, but
we tailor the methodology to the structure of the Womply data, as
detailed in Appendix \ref{sec:Data-Womply}. To take one example,
there are again discontinuous breaks in the number of observed small
businesses due to churn in the observed sample of payment processors,
analogous to the entry and exit of card providers in consumer spending
data. However, unlike the repeated cross-sections of consumer spending
data, we can address such sample churn more directly using the panel
data on small businesses. In each calendar year, we follow the sample
of businesses operating during the first week of the year: no new
businesses enter the panel mid-year. We must still detect cases where
a payment processor exits the sample, and we adopt a similar approach
to detecting discontinuous breaks as we applied to consumer spending
data. We look for sharp drops in businesses operating at the state
and national levels (Appendix Figure \ref{fig:breaks_womply}).[^womply_breaks] 
After adjusting for these discontinuous exits, we proceed with the
rest of the steps described in Appendix \ref{sec:Data-Womply} to
construct a seasonally-adjusted series for total small business revenue.
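
The fixed-panel idea can be sketched as below. This is a minimal illustration, assuming a hypothetical record layout of `(business_id, year, week, revenue)` and treating "operating" as having positive revenue in week 1; neither assumption is taken from the Womply processing code.

```python
from collections import defaultdict


def fixed_panel_revenue(records):
    """Total revenue by (year, week) for businesses observed in week 1.

    records: iterable of (business_id, year, week, revenue) tuples.
    Businesses that first appear after week 1 of a year are excluded
    from that year's totals, so mid-year entry cannot move the series.
    """
    records = list(records)
    panel = defaultdict(set)
    for business_id, year, week, revenue in records:
        if week == 1 and revenue > 0:
            panel[year].add(business_id)
    totals = defaultdict(float)
    for business_id, year, week, revenue in records:
        if business_id in panel[year]:
            totals[(year, week)] += revenue
    return dict(totals)
```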

[^womply_breaks]: We use a higher level of geographic aggregation to detect breaks here
than the county-level aggregation used for consumer spending because
the number of small businesses is an order of magnitude smaller than
the number of active credit and debit cards, and so tests for structural
breaks have less power.

Womply revenues are broadly distributed across sectors. A larger share
of the Womply revenue data come from industries that have a larger
share of small businesses, such as food services, professional services,
and other services, as one would expect given that the Womply data
only cover small businesses (Appendix Figure \ref{fig:ind_shares}).

### Job Postings {#subsec:Data-Jobs}

We obtain data on job postings from 2007 to the present from [Lightcast](https://lightcast.io/)
(formerly known as Burning Glass Technologies). Lightcast aggregates
nearly all jobs posted online from approximately 40,000 online job
boards in the United States. Lightcast then removes duplicate postings
across sites and assigns attributes including geographic locations,
required job qualifications, and industry.

We receive raw data from Lightcast on job postings disaggregated by
industry, week, job qualifications and county.[^industry_def] 
We report job postings at the weekly level, expressed as changes
in percentage terms relative to the first four complete weeks of 2020.
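
The normalization amounts to indexing each weekly count to a baseline mean. A minimal sketch, assuming the input list starts at the first complete week of 2020 (the function name is hypothetical):

```python
def pct_change_vs_baseline(weekly_counts, baseline_weeks=4):
    """Percent change of each week relative to the mean of the baseline weeks.

    Assumes weekly_counts[0] is the first complete week of 2020.
    """
    baseline = sum(weekly_counts[:baseline_weeks]) / baseline_weeks
    return [100.0 * (count / baseline - 1.0) for count in weekly_counts]
```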

[^industry_def]: Industry is defined using select [NAICS supersectors](https://www.bls.gov/sae/additional-resources/naics-supersectors-for-ces-program.htm),
aggregated from 2-digit NAICS classification codes. Job qualifications
are defined using O\*NET [job zones](https://www.onetonline.org/help/online/zones),
which classify jobs into five groups based on the amount of preparation
they require. We also obtain analogous data broken down by educational
requirements.

Lightcast provides a sample that is representative of private sector
job postings in the U.S. Appendix Figure \ref{fig:bg_jolts} shows
that the distribution of industries in the Lightcast data is well-aligned
with the Bureau of Labor Statistics’ Job Openings and Labor
Turnover Survey ([JOLTS](https://www.bls.gov/jlt/)), consistent
with @carnevale2014understanding.

### Employment {#subsec:Data-Employment}

We use three data sources to obtain information on employment rates:
payroll data from [Paychex](https://www.paychex.com/) and [Intuit](https://www.intuit.com/)
and worker-level data from [Earnin](https://www.earnin.com).
We describe each of these data sources in turn and then discuss how
we construct a weekly series that is broadly representative of private
non-farm employment rates in the U.S. (see Appendix Tables \ref{tab:industry_share}
and \ref{tab:wage_oes}).

*Paychex and Intuit*. Paychex provides payroll services to
approximately 670,000 small- and medium-sized businesses across the
United States and pays 8% of U.S. private-sector workers [@paychexemploymentwatch].
To track how employment changes vary across the wage distribution,
we separate employees into four groups based on their hourly wage
rates. We split the sample into four groups using thresholds at which
full-time, full-year earnings would equal 100%, 150%,
and 250% of the federal poverty line (FPL). For convenience, we refer
to these groups as “wage quartiles” because these thresholds group
workers approximately into quartiles before the pandemic.[^poverty_thresholds]
This approach allows us to track the total number of jobs in different
parts of the wage distribution, adjusting for inflation over time.
We obtain aggregate weekly data on total employment for each hourly
wage group by county, industry (two-digit NAICS), firm size bin, and
pay frequency.

[^poverty_thresholds]: In January 2020, the thresholds were ${{wage_threshold_q1q2}},
${{wage_threshold_q2q3}} and ${{wage_threshold_q3q4}}
and the four bins in ascending order by wage contained {{pct_emp_cps_q1}}%,
{{pct_emp_cps_q2}}%, {{pct_emp_cps_q3}}%, and {{pct_emp_cps_q4}}%
of CPS respondents. The federal poverty line (FPL) is updated annually
at the beginning of each year. We use the annual FPL to set the thresholds
each January and smoothly adjust the thresholds within the year using
CPI inflation, as described in Appendix \ref{sec:Data-Employment}.
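
The mapping from hourly wages to wage groups can be sketched as follows. The conversion from an annual poverty line to an hourly threshold assumes 2,080 hours of work per year (40 hours for 52 weeks), and the poverty-line value in the usage note is a made-up round number; both are illustrative assumptions, and the within-year CPI adjustment is omitted.

```python
from bisect import bisect_right

FPL_MULTIPLES = (1.0, 1.5, 2.5)  # 100%, 150%, and 250% of the poverty line


def hourly_thresholds(fpl_annual, annual_hours=2080):
    """Hourly wages at which full-time, full-year pay equals each FPL multiple.

    annual_hours=2080 is an assumption (40 hours x 52 weeks).
    """
    return [multiple * fpl_annual / annual_hours for multiple in FPL_MULTIPLES]


def wage_group(hourly_wage, thresholds):
    """Return 1 to 4; group 1 lies below the lowest threshold."""
    return bisect_right(thresholds, hourly_wage) + 1
```

For instance, with a hypothetical poverty line of 20,800 dollars per year, `hourly_thresholds(20800)` gives cutoffs of 10, 15, and 25 dollars per hour, and `wage_group(12.0, [10.0, 15.0, 25.0])` assigns the worker to the second group.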

Intuit offers payroll services to businesses as part of its QuickBooks
program, covering approximately one million businesses as of January 2020. 
Businesses that use QuickBooks tend to be very small (fewer
than 20 employees). We obtain anonymized, aggregated data on month-on-month
and year-on-year changes in total employment (the number of workers
paid in the prior month) based on repeated cross-sections. We construct
a national series from population-weighted averages of state changes
in each month.
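
The aggregation step is a weighted mean. A minimal sketch, with hypothetical dictionaries keyed by state:

```python
def population_weighted_mean(state_changes, state_populations):
    """Weight each state's employment change by its population."""
    total_population = sum(state_populations[s] for s in state_changes)
    weighted_sum = sum(state_changes[s] * state_populations[s] for s in state_changes)
    return weighted_sum / total_population
```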

To protect business privacy and maximize precision, we combine Paychex
and Intuit data to construct our primary employment series. We clean
this series for analysis following the general principles described
above (see Appendix \ref{sec:Data-Employment}).[^integer_bunching] 
We do not seasonally adjust our employment series because we have
incomplete data in 2019; fortunately, seasonal fluctuations in employment
are an order of magnitude smaller than those in spending (Appendix
Figure \ref{fig:seasonal_fluct}) and hence are unlikely to affect
our results. 

[^integer_bunching]: As an example of the specific data processing challenges that we address in constructing the employment series, bunching at integer values
in the wage distribution generates discontinuities in the number of
workers assigned to each wage group as the thresholds for the groups
are updated due to inflation. For example,
when the threshold for the lowest wage group crosses $14/hour, a
discrete mass of workers who were previously part of the second
quartile is now classified in the bottom quartile, causing
a discontinuity in both series. To address this issue, we spread workers
out from whole-number wages by adding a random number between
-0.5 and 0.5 to their hourly wage, transforming the point mass at
the integer *wage* into a uniform distribution over $[wage-0.5,wage+0.5]$
(see Appendix \ref{subsec:Internal-Data-Processing} for details).
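
The smoothing of integer bunching can be sketched as below. This is an illustration under the assumption that only exact whole-number wages are jittered; the function name and the optional seeded generator are hypothetical conveniences.

```python
import random


def jitter_integer_wages(wages, rng=None):
    """Add Uniform(-0.5, 0.5) noise to wages that are exact whole numbers.

    Non-integer wages are returned unchanged. Pass a seeded random.Random
    instance as rng for reproducible output.
    """
    rng = rng or random.Random()
    return [
        wage + rng.uniform(-0.5, 0.5) if float(wage).is_integer() else wage
        for wage in wages
    ]
```

After jittering, the share of workers below a threshold changes gradually as the threshold moves through the interval around an integer wage, rather than jumping when it crosses the integer itself.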

*Earnin*. Earnin is a financial management application that
provides its members with access to their income as they earn it,
in advance of their paychecks. Because its users tend to have lower
income levels, Earnin primarily provides information on employment
for low-wage workers. We obtain anonymized data from Earnin from January
2020 onward, describing the date a paycheck is received, workplace
ZIP code, firm size, industry and amount of pay. Earnin complements
the firm-based payroll datasets discussed above by providing a worker-level
sample with more granular ZIP-level geographic identifiers. However,
because workers self-select into the sample when they enter or exit
the Earnin customer base, the labor market disruptions of the pandemic
generate substantial sample selection over time. We therefore use
the Earnin sample only to study the first six months of the COVID
pandemic, from March to September 2020, when the sample is relatively
stable. We convert the Earnin data into an employment series using
an approach similar to that used to construct the combined Paychex
and Intuit employment series (detailed in Appendix \ref{sec:Data-Employment}). 

### Public Data Sources: UI Records, COVID-19 Incidence, and Google Mobility Reports

In addition to the new private sector data sources described above,
we also collect and use three sets of data from public sources to
supplement our analysis: data on unemployment benefit claims obtained
from the Department of Labor and state government agencies; data on
COVID-19 cases and deaths obtained from the New York Times, Johns
Hopkins, the CDC and the U.S. Department of Health and Human Services;
and data on the amount of time people spend at home vs. other locations
obtained from Google's COVID-19 Community Mobility Reports. Further
details on these data sources are provided in Appendices \ref{sec:Data-UI}
to \ref{sec:Data-Mobility}.

## Limitations {#subsec:Data-Limitations}

The rest of this paper demonstrates how the database assembled here
is valuable for uncovering the economic impacts of COVID-19. However,
these new data also have three important limitations that users should
weigh, especially in future applications.

First, each data series we construct necessarily reflects the clientele
of the data provider, and thus does not provide guarantees of population
representativeness. We take several steps to verify that each data
series is nationally representative: we compare the cross-sectional
composition of each series against nationally representative statistics
in this section, and compare trends in each series during the COVID-19
pandemic to data from publicly available benchmarks in the next section.
But it is impossible to verify the representativeness of each level
of disaggregation (e.g., county-level consumer spending), precisely
because no existing public datasets provide similarly granular and
high-frequency data -- hence the value of these novel data sources.
Given this limitation, it is valuable to verify empirical results
using multiple different data series and triangulate findings against
whatever data are available from representative surveys at coarser
levels of aggregation, as we do in our analysis below.

Second, the series we construct have sampling error both from idiosyncratic
variation across firms and households and from changing client
bases and business closures. The economic shocks associated with COVID-19
were especially large, making their impacts easy to detect even in
the presence of such errors. In Section \ref{subsec:Impacts-Employment},
we show that the data series are sufficiently reliable to detect moderate-sized
changes in economic activity at local levels (e.g., employment rate
changes of {{rmse_top_50_times2}}pp at the commuting zone level
for the 50 largest CZs). Smaller fluctuations -- e.g., monthly innovations
in employment rates during periods of normal economic growth -- will
not be distinguishable from sampling noise in these datasets.

Third, while our data cover certain sectors well -- such as spending
on items covered by credit and debit cards -- they entirely exclude
other sectors, such as spending on housing and durable goods such
as vehicles. In the context of the COVID-19 pandemic, the data series
we construct overlap with the sectors that exhibit the largest changes
in economic activity (see Section \ref{sec:Impacts}), but in other
applications that may not be the case.

In light of these limitations, the data constructed here should be
used to complement representative survey-based statistics, not
to replace them. Furthermore, the present version of the database
is a prototype that can be improved over time. For example, noise
in estimates of employment rates from changes in payroll firms' clientele
can be mitigated by chaining together estimates of employment changes
from rotating panels of firms instead of relying on repeated cross-sections.
Adding data partners can address gaps in coverage, such
as spending on housing. Such refinements could mitigate the limitations
described above, though statistics from representative surveys will
remain essential as benchmarks.

# Economic Impacts of COVID-19 {#sec:Impacts}

According to the Bureau of Economic Analysis [-@USBEA_NIPA],
GDP fell by ${{gdp_dollars}} trillion (an annualized rate of {{gdp_percent}}%)
from the first quarter of 2020 to the second quarter of 2020, shown
by the first bar in Appendix Figure \ref{fig:gdp_changes_q1_q2_2020}.
GDP fell primarily because of a reduction in personal consumption
expenditures (consumer spending), which fell by ${{pce_dollars}}
trillion. Government purchases and net exports did not change significantly,
while private investment fell by ${{inv_dollars}} trillion.[^private_investment] 
We therefore begin our analysis by studying the determinants of this
sharp reduction in consumer spending. We then turn to examine downstream
impacts of the reduction in consumer spending on business activity
and the labor market.

[^private_investment]: Most of the reduction in private investment was driven by a reduction
in inventories and equipment investment in the transportation and
retail sectors, both of which are plausibly a response to reductions
in current and anticipated consumer spending. In the first quarter
of 2020, consumer spending accounted for an even larger share of the
reduction in GDP, further supporting the view that the initial shock
to the economy came from a reduction in consumer spending [@USBEA_NIPA].

## Consumer Spending {#subsec:Impacts-Spending}

We analyze consumer spending using data on aggregate credit and debit
card spending. National accounts data show that spending that is well
captured on credit and debit cards -- essentially all spending excluding
housing, healthcare, and motor vehicles -- fell by approximately ${{cc_dollars}} 
trillion between the first quarter of 2020 and the second quarter of 2020, 
comprising {{pce_decline_cc_share}}% of the total reduction in personal consumption expenditures.[^healthcare_expenditures]

[^healthcare_expenditures]: The rest of the reduction is largely accounted for by healthcare expenditures;
housing and motor vehicle expenditures did not change significantly.

*Benchmarking.* Our card spending series is well aligned with
the Advance Monthly Retail Trade Survey (MARTS), one of the main inputs
used to construct the national accounts.[^vs_marts] 
Appendix Figure \ref{fig:affinity_vs_marts_month_on_month} plots
the month-on-month changes in spending on retail services (excluding
auto-related expenses) and food services: both series track each other
before the pandemic, then food services spending drops rapidly in
March and April 2020, while total retail spending fluctuates much
less during the pandemic. The root mean square error of the Affinity
series relative to the MARTS is {{aff_mrts_ret_rmse_int}}
to {{aff_mrts_food_rmse_int}} pp, which is small relative
to the fluctuations induced by COVID, but calls for caution in evaluating
smaller shocks. Appendix Figure \ref{fig:affinity_mrts} expands this
analysis to other categories by plotting the change in spending from
January to April 2020 in the Affinity spending series against the
decline in consumer spending as measured in the MARTS. Despite the
fact that the MARTS category definitions are not perfectly aligned
with those in the card spending data, the relative declines are generally
well aligned across sectors, with a correlation of {{aff_mrts_ind_corr}}.[^vs_coinout]
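
For reference, the root mean square error between two series of month-on-month changes can be computed as in the sketch below. The function is generic; which months enter the comparison is left to the caller and is not specified here.

```python
from math import sqrt


def rmse(series_a, series_b):
    """Root mean square error between two equally long series (in pp)."""
    if len(series_a) != len(series_b):
        raise ValueError("series must cover the same periods")
    squared_errors = [(a - b) ** 2 for a, b in zip(series_a, series_b)]
    return sqrt(sum(squared_errors) / len(squared_errors))
```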

[^vs_marts]: The series are not perfectly comparable because the category definitions
differ slightly across the datasets. For example, we observe food
and accommodation services combined together in the card data but
only food services in the MARTS. In addition, the MARTS includes corporate
card transactions, whereas we exclude them in order to isolate consumer
spending. Hence, we would not expect the series to track each other
perfectly even if the card spending data provided a perfect representation
of national spending patterns.

[^vs_coinout]: One specific source of potential bias in our spending series is that
it does not include cash transactions and thus could be biased by
potential substitution from cash to credit card purchases. We evaluate
this concern using receipts data from CoinOut, which allows us to
measure cash spending on groceries (see Appendix \ref{subsec:Data-Affinity-Masking-and-Publication}).
In practice, trends in card and cash spending track each other closely
(Appendix Figure \ref{fig:affinity_coinout}). These results -- together
with the fact that our card spending series closely track estimates
from the MARTS -- indicate that aggregate fluctuations in card spending
do not appear to have been offset by opposite-signed changes in cash
spending.

*Heterogeneity by Income.* We begin by examining spending changes
by household income. We do not directly observe cardholders' incomes
in our data; instead, we proxy for cardholders' incomes using the
median household income in the ZIP code in which they live (based
on data from the 2014-18 American Community Survey). ZIP codes are
strong predictors of income because of the degree of income segregation
in most American cities; however, they are not a perfect proxy for
income and can be prone to bias in certain applications, particularly
when studying tail outcomes [@chetty2020college]. To evaluate
the accuracy of our ZIP code imputation procedure, we compare our
estimates to those in contemporaneous work by @farrellJPMorgan2020,
who observe cardholder income directly based on checking account data
for clients of JPMorgan Chase. Our estimates are closely aligned with
those estimates, suggesting that the ZIP code proxy is reasonably
accurate in this application.[^vs_jpmorgan]

[^vs_jpmorgan]: @farrellJPMorgan2020 report an eight percentage point larger
decline in spending for the highest income quartile relative to the
lowest income quartile in the second week of April. Our estimate of
the gap at that time is also eight percentage points, although the
levels of the declines in our data are slightly smaller in magnitude
for both groups.

Figure \ref{fig:spending_changes_by_income_quartile} plots a seven-day
moving average of total daily card spending for households in the
bottom vs. top quartile of ZIP codes based on median household income.
Spending fell sharply on March 15, 2020, two days after the National Emergency
was declared and as the threat of COVID became widely discussed in the United
States. Spending fell from ${{spend_level_q4_feb2020}} billion
per day in February 2020 to ${{spend_level_q4_mar2020}} billion
per day between March 25 and April 14, 2020 (a {{spend_decline_febtomar2020_q4}}%
reduction) for high-income households; the corresponding change for
low-income households was ${{spend_level_q1_feb2020}} billion
to ${{spend_level_q1_mar2020}} billion (a {{spend_decline_febtomar2020_q1}}%
reduction).

Because high-income households cut spending more in percentage terms
and accounted for a larger share of aggregate spending to begin with,
they accounted for a much larger share of the decline in total spending
in the U.S. than low-income households. In Column 2 of Appendix Table
\ref{tab:spending_changes}, Panel A, we estimate that as of mid-April 2020,
top-quartile households accounted for {{share_decline_early_q4}}%
of the aggregate spending decline after the COVID shock, while bottom-quartile
households accounted for only {{share_decline_early_q1}}%
of the decline.
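
The arithmetic behind these shares is simple: a group's share of the aggregate decline is its dollar decline divided by the sum of dollar declines across groups. The sketch below uses two groups and made-up spending levels purely for illustration; it does not reproduce the estimates in the table.

```python
def decline_shares(pre, post):
    """Percent decline and share of the total decline, by group.

    pre and post map each group to its spending level before and after
    the shock. Returns (pct_decline, share_of_total_decline), in percent.
    """
    declines = {group: pre[group] - post[group] for group in pre}
    total_decline = sum(declines.values())
    pct_decline = {g: 100.0 * declines[g] / pre[g] for g in pre}
    share = {g: 100.0 * declines[g] / total_decline for g in pre}
    return pct_decline, share
```

With hypothetical levels of 10 and 4 falling to 6 and 3, the first group's spending falls by 40% and the second group's by 25%, but the first group accounts for 80% of the total decline because it starts from a higher level and cuts more in percentage terms.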

This gap in spending patterns by income grew even larger over time.
By August 2020, spending had returned to 2019 levels among households
in the bottom quartile, whereas spending among high-income households
remained {{gap_q4_spend_aug2020}}% below baseline levels.
Spending then continued to rise gradually in subsequent months and
began to exceed pre-COVID levels starting in 2021 for both low- and
high-income groups. The degree of heterogeneity in spending changes
by income is larger than that observed in previous recessions [see
Figure 6 in @petev2011consumption] and played a central role
in the downstream impacts of COVID on businesses and the labor market,
as we show below.

*Heterogeneity Across Sectors.* Next, we disaggregate the change
in total card spending across categories to understand why households
cut spending so rapidly. In particular, we seek to distinguish two
channels: reductions in spending due to loss of income vs. fears of
contracting or spreading COVID.

The left bar in Figure \ref{fig:spending_changes_by_sector} plots
the share of the total decline in spending from the pre-COVID period
to mid-April 2020 accounted for by various categories. Of the total
reduction in spending, {{person_decline}}% came from reduced spending on goods or
services that require in-person contact (and thereby carry a risk
of COVID infection), such as hotels, transportation, and food services.
This is particularly striking given that these categories accounted for
less than one-third of total spending in January, as shown by the
right bar in Figure \ref{fig:spending_changes_by_sector}. These gaps
grew larger as the pandemic progressed, as consumer spending
increased above pre-pandemic levels for goods and remote services
by mid-August 2020, but remained sharply depressed for in-person services
(Appendix Table \ref{tab:spending_changes}, Panel B). The fact that
the spending reductions vary so sharply across goods in line with
their health risks indicates that health concerns (either one's own
health or altruistic concerns about others' health) rather than a
lack of purchasing power drove spending reductions.

These patterns of spending reductions differ sharply from those observed
in prior recessions. Figure \ref{fig:spending_changes_by_sector_covid_gfc}
compares the change in spending across categories in national accounts
data in the COVID recession and the Great Recession in 2009-10. In
the Great Recession, nearly all of the reduction in consumer spending
came from a reduction in spending on goods; spending on services was
almost unchanged. In the COVID recession, {{serv_covid}}% of
the reduction in total spending came from a reduction in spending
on services.

*Heterogeneity by COVID Incidence.* To further evaluate the
role of health concerns, we examine the association between COVID
case rates across areas and changes in spending. Figure \ref{fig:covid_isolation_associations}
shows that spending fell more in counties with higher rates of COVID
infection, in both low- and high-income areas, during the trough in
consumer spending from March 25 to April 14, 2020. However, there
was a substantial reduction in spending even in areas without high
rates of realized COVID infection, consistent with widespread concern
about the disease even in areas where outbreaks were less prevalent.
To examine the mechanism driving these spending reductions, Appendix
Figure \ref{fig:covid_isolation_associations_appendix} uses anonymized
cell phone data from Google to present a binned scatter plot of the
amount of time spent outside home vs. COVID case rates, again separately
for low- and high-income counties. As in Figure \ref{fig:covid_isolation_associations},
there is a strong negative relationship between time spent outside
and COVID case rates, with a steeper slope in low-income counties.
The reduction in spending on services that require physical, in-person
interaction (e.g., restaurants) follows directly from this reduction
in time spent outside.

In sum, disaggregated data on consumer spending reveal that spending
in the initial stages of the pandemic fell primarily because of health
concerns rather than a loss of current or expected income -- consistent
with the mechanisms emphasized by @eichenbaum2020macroeconomics.
Disposable income ultimately fell relatively little because few high-income
individuals lost their jobs (as we show in Section \ref{subsec:Impacts-Employment}
below) and because the income losses of lower-income households who
lost their jobs were more than offset by supplemental unemployment
benefits, stimulus payments, and other transfers [@ganong2020us;
@blanchet2022real]. Next, we turn to the impacts of the spending
reductions induced by these health concerns on businesses and the
labor market.

## Business Revenues

Services that are consumed in person (e.g., restaurants) are typically
produced by small businesses that serve customers in their local area.[^naics_72_size] 
The reduced in-person spending by high-income households documented
above thus has heterogeneous impacts across areas, with businesses
located in more affluent areas facing larger spending shocks. We exploit
this geographic heterogeneity to identify the impacts of the reduction
in consumer spending on businesses and their employees, starting by
examining impacts on small business revenues.[^size_and_geography]

[^naics_72_size]: For example, over 50% of workers in food and accommodation services
(a major non-tradeable sector) work in establishments with fewer than
50 employees [@SUSB2017].

[^size_and_geography]: We focus on small businesses because their customers are typically
located near the business itself; larger businesses' customers (e.g.,
large retail chains) are more dispersed, making the geographic location
of the business less relevant.

*Benchmarking.* We measure small business revenues using data
from Womply, which records revenues from credit card transactions
for small businesses (as defined by the Small Business Administration)
at the location where the sale occurs. Because there is no publicly
available series on small business revenues, we compare trends in
the Womply data to the Affinity consumer spending data. These series
are generally well aligned, especially in sectors with a large share
of small businesses, such as food and accommodation services, where
the RMSE of the Womply series relative to Affinity is {{womply_aff_food_rmse}}pp
(Appendix Figure \ref{fig:revenue_spending}). For retail, where large
businesses have a larger market share, the RMSE is {{womply_aff_ret_rmse}}pp.

*National Trends.* In the aggregate time series (plotted in
Appendix Figure \ref{fig:smallbiz_national_trend}), small business
revenues fell by {{rev_apr2020}}% when the pandemic began and
then recovered to {{rev_jul2020}}% below pre-COVID levels by
July 2020. Small business revenues then remained at that level until
late 2020, reaching pre-COVID levels only in September 2021. The larger
fall and slower recovery of small business revenues relative to total
consumer spending is consistent with evidence that consumer spending
shifted toward large online retailers during the pandemic [@SecondMeasureAlexanderKarger].
Unfortunately, we lack data on revenues at large businesses, so we
cannot examine these impacts directly.

*Heterogeneity Across Areas.* To illustrate the data underlying
our geographic analysis, we map the change in small business revenues
from January 2020 to the period immediately after the COVID shock
(March 23 to April 12, 2020) by ZIP code in New York City, Chicago
and San Francisco (Appendix Figure \ref{fig:zipmaps_smallbiz}).[^zctas]
In all three cities, revenue losses were largest in the most affluent
neighborhoods (e.g., Manhattan in New York and Lincoln Park in Chicago)
and in the central business districts in each city. But even within
predominantly residential areas, businesses located in more affluent
neighborhoods suffered much larger revenue losses.

[^zctas]: We use 2010 Census ZIP Code Tabulation Areas (ZCTAs) to perform all
geographic analyses of ZIP-level data. Throughout the text, we refer
to these areas simply as “ZIP codes”.

Figure \ref{fig:smallbiz_zip_associations} generalizes these examples
by presenting a binned scatter plot of percent changes in small business
revenue vs. median rents (for a two bedroom apartment) by ZIP code.[^rents] 
In the top ventile of ZIP codes by rent, small business revenues
fell by {{revenue_apr_rent_p100}}%, as compared to {{revenue_apr_rent_p5}}%
in the bottom ventile of ZIP codes by rent, consistent with the differences
observed in the Affinity consumer spending data across areas.[^safegraph]
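
A binned scatter plot of this kind can be constructed as in the sketch below, which sorts observations by the x variable, splits them into 20 equal-count bins (ventiles), and reports the mean of x and y within each bin. The sketch is unweighted, which is an assumption; any weighting used for the figure is not reproduced here.

```python
def binned_scatter(x, y, n_bins=20):
    """Mean of x and y within equal-count bins of x (ventiles by default)."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    points = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if chunk:  # bins can be empty when n < n_bins
            mean_x = sum(p[0] for p in chunk) / len(chunk)
            mean_y = sum(p[1] for p in chunk) / len(chunk)
            points.append((mean_x, mean_y))
    return points
```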

[^rents]: Rents are a simple measure of the affluence of an area that combines
income and population density: the highest rent ZIP codes tend to
be high-income, dense areas such as Manhattan. Plotting small business
revenue against median incomes or population density produces analogous
results (Appendix Figure \ref{fig:smallbiz_zip_associations_appendix}).

[^safegraph]: Part of the reason that revenues fell so sharply in high-rent ZIP
codes is that affluent families moved elsewhere during the pandemic.
To quantify the relative contribution of such “extensive margin”
mechanisms vs. intensive-margin reductions in spending by high-income
households who did not leave, we use aggregated mobile phone data
from SafeGraph [e.g., @allcott2020] to estimate changes in
local population at high frequencies. Although population fell more
in high-rent areas, changes in small business revenues as of April
2020 still exhibit a sharp gradient with respect to local rents even
conditional on SafeGraph-based estimates of population counts ({{smallbiz_rent_safegraph}}%
per $1000 rent, s.e. {{smallbiz_rent_safegraph_se}}).

The business revenue loss vs. rent gradient is similar when we compare
ZIP codes within the same county by regressing revenue changes on
rent with county fixed effects (Table \ref{tab:rent_association}
Panel A, Column 2), or when comparing businesses within the same industry
across ZIP codes using sector fixed effects (Appendix Figure \ref{fig:revenue_rent_sectorfes}).
It also remains similar when controlling for the (pre-COVID) density
of high-wage workers in a ZIP code to account for differences that
may arise from shifts to remote work in business districts (Table
\ref{tab:rent_association} Panel A, Column 3).[^county_1perc]
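
The logic of the county fixed effects specification can be illustrated with the within estimator: demeaning revenue changes and rents by county and regressing one on the other uses only variation across ZIP codes within the same county. The sketch below is unweighted and includes no additional controls, both simplifying assumptions, and its names are hypothetical.

```python
from collections import defaultdict


def within_slope(y, x, group):
    """OLS slope of y on x after removing group (e.g., county) means."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # sum of y, sum of x, count
    for yi, xi, g in zip(y, x, group):
        sums[g][0] += yi
        sums[g][1] += xi
        sums[g][2] += 1
    numerator = denominator = 0.0
    for yi, xi, g in zip(y, x, group):
        sum_y, sum_x, count = sums[g]
        dy = yi - sum_y / count
        dx = xi - sum_x / count
        numerator += dx * dy
        denominator += dx * dx
    return numerator / denominator
```

In a toy example where two counties have very different average revenue changes but the same within-county gradient, this estimator recovers the common gradient, whereas a pooled regression would confound it with the difference between counties.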

[^county_1perc]: Of course, households do not restrict their spending solely to businesses
in their own ZIP code. We find similar patterns when zooming out to
the county level. Counties with larger top 1% income shares experienced
larger losses of small business revenue (Appendix Figure \ref{fig:revenue_vs_top1income}).
Poverty rates are not strongly associated with revenue losses at the
county level (Appendix Figure \ref{fig:revenue_sharepoverty}), indicating
that it is the presence of the rich in particular (as opposed to the
middle class) that is most predictive of economic impacts on local
businesses.

In sum, businesses located in dense, affluent areas lost the most
revenue -- consistent with the sharp reduction in spending on in-person
goods and services by high-income households. Next, we examine how
businesses reacted to this loss of revenue, focusing on the incidence
of the shock on their employees.

## Labor Market Impacts {#subsec:Impacts-Employment}

We begin by analyzing how the loss of revenues affected labor demand
using data on job postings from Lightcast. Figure \ref{fig:jobs_loweduc_vs_rent}
presents a binned scatter plot of the change in job postings that
require minimal education between January 2020 and the April 2020
trough vs. median rents by county. Job postings with minimal educational
requirements fell much more sharply in high-rent areas than in
lower-rent areas (difference = {{change_rent_high_low_pp}}pp,
or {{change_rent_high_low_pct}}%), consistent with the larger shocks
to revenue faced by firms located in high-rent areas. By contrast,
postings for jobs that require higher levels of education --- which
are much more likely to be in tradeable sectors that are less influenced
by local conditions (e.g., finance or professional services) ---
exhibit no relationship with local rents (Figure \ref{fig:jobs_higheduc_vs_rent}).

Having established that the pandemic reduced labor demand especially
for lower-skilled workers working in affluent areas, we next turn
to examine its impacts on employment rates using data from payroll
companies.

*Benchmarking.* Our payroll-based employment series is broadly
aligned with measures from nationally representative statistics. Appendix
Figure \ref{fig:employment_benchmark_allindustries} shows that month-on-month
changes in employment rates for all workers estimated from combined
Paychex and Intuit payroll data generally fall between estimates obtained
from the Current Employment Statistics (a survey of businesses) and
Current Population Survey (a survey of households). Turning to specific
sectors, Appendix Figure \ref{fig:employment_benchmark_byindustry}
focuses on month-on-month employment changes in two sectors that experienced
very different trajectories: food services, where employment fell
heavily, and professional services, where it did not. In both cases,
our Paychex-Intuit series closely tracks data from the CES. Appendix
Figure \ref{fig:emp_ces_by_naics} shows more generally that changes
in employment rates across private non-farm sectors (two-digit NAICS)
are very closely aligned in our series and the CES, with a correlation
of {{emp_ces_ind_corr}} when looking at changes from January to July 2020.

Unlike with spending and business revenues, there are publicly available
sources of data on employment rates that can be disaggregated geographically
and used to evaluate the representativeness of our data across areas.
Our employment series closely matches state-level variation in employment
changes during the pandemic in the CES, with a population-weighted
correlation of {{emp_ces_state_corr_2020}} when looking at
changes from January to July 2020 (Appendix Figure \ref{fig:emp_ces_by_state}).
Our estimates are also well aligned with commuting-zone-level estimates
from the Quarterly Census of Employment and Wages (QCEW) (Appendix
Figure \ref{fig:emp_qcew_by_county}). Similarly, disaggregating the
national data by wage rate, we find that our estimates are closely
aligned with estimates based on the Current Population Survey and
estimates in @ADPHurstetal (Appendix Figure \ref{fig:employment_benchmark_adp_cps}).

These comparisons indicate that our combined employment series provides
representative estimates of changes in employment rates across wage
groups and geographic areas during the COVID pandemic. A natural question
going forward is how accurate our local area estimates will be in
more typical periods, where the shocks of interest are likely to be
far smaller than during the pandemic. To evaluate the accuracy of
our data from this broader perspective, we calculate the population-weighted
root-mean-squared-error (RMSE) between our estimates of CZ-level changes
in quarterly employment in January to September 2021 and corresponding
statistics from the QCEW. We find an RMSE of {{rmse_top_50}}pp
for the 50 largest CZs and {{rmse_all}}pp when including all
CZs. Since the QCEW statistics are based on unemployment insurance
records covering the entire population, the RMSE can be loosely interpreted
as the average standard error of our estimate, accounting for noise
arising from both sampling error and changes in non-representative
sampling. The relatively small RMSEs indicate that our data can identify
employment shocks considerably smaller than those induced by the pandemic.
For instance, the worst-hit quartile of CZs in the U.S. during the Great
Recession had mean employment losses of {{emp_loss_great_recession_worst}}pp
from 2007 to 2010, while the least-hit quartile of CZs had mean employment
gains of {{emp_loss_great_recession_least}}pp; our data would
have been sufficiently precise to reliably differentiate those CZs.
As another example, @aldy2014 estimates that the 2010 Gulf
Oil spill decreased employment in non-panhandle Gulf-coast Florida
counties by 2.7pp; since the population of this region is equivalent
to the 4th largest CZ (with population of 7 million), our payroll-based
series would have been sufficiently precise to detect and monitor
this effect in near-real-time as well.
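
To make the accuracy metric concrete, the following minimal Python sketch computes a population-weighted RMSE between two sets of CZ-level employment changes. It is an illustration under stated assumptions rather than the code used to produce our estimates: the function name `weighted_rmse`, the input layout, and all numbers are hypothetical.

```python
import math


def weighted_rmse(estimates, benchmarks, weights):
    """Population-weighted root-mean-squared error between two series."""
    if not (len(estimates) == len(benchmarks) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weight each squared gap by the population of the corresponding CZ.
    weighted_sq_err = sum(
        w * (e - b) ** 2 for e, b, w in zip(estimates, benchmarks, weights)
    )
    return math.sqrt(weighted_sq_err / total_weight)


# Hypothetical employment changes (in pp) for three CZs; not real data.
payroll_changes = [-2.0, -1.0, 0.5]
benchmark_changes = [-1.0, -1.0, 1.5]
populations = [1_000_000, 2_000_000, 1_000_000]

rmse = weighted_rmse(payroll_changes, benchmark_changes, populations)
```

Because larger CZs receive more weight, a given discrepancy in a populous CZ raises the RMSE more than the same discrepancy in a small one.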

The key limitation of publicly available employment data is that existing
data sources can only be disaggregated either by county *or*
wage level. Our payroll-based data sources allow us to measure changes
in employment by county *and* wage level, which we show next
proves to be valuable in understanding the impacts of the COVID shock.[^timely_emp]

[^timely_emp]: Another benefit of our payroll-based employment series is the timeliness
of its local-area estimates: it matches the county-level granularity
of the QCEW (which is released with a lag of 6 months), but with the
timeliness of the monthly employment statistics in the CES that are
released at the national level and for 450 MSAs.

*Heterogeneity by Wage Rates.* Figure \ref{fig:employment_changes_by_income_quartile}
plots employment rates by real pre-pandemic wage quartile. Each series
shows the change in the total number of workers employed in jobs with
hourly wage rates that fall in the relevant quartile of the pre-COVID
wage distribution (with thresholds adjusted over time for inflation
as described in Section \ref{subsec:Data-Employment}) relative to
the baseline level in January 2020.

We find substantial heterogeneity in job losses by wage rate, consistent
with the findings of @ADPHurstetal in prior work using
ADP data. Employment rates fell by {{emp_apr_q1}}% around the
trough of the recession (April 15, 2020) for workers in the bottom wage
quartile (i.e., the total number of jobs paying $<$${{wage_threshold_q1q2}}/hour
in January 2020 was {{emp_apr_q1}}% lower as of April 15).
By contrast, employment rates fell by only {{emp_apr_q4}}%
for those in the top wage quartile (those jobs paying more than ${{wage_threshold_q3q4}}/hour
in January 2020) as of April 15.

High-wage workers not only were less likely to lose their jobs to
begin with, but also recovered their jobs much more quickly. By June
2020 -- just three months after the recession began -- employment
for high-wage workers had nearly returned to the pre-COVID baseline.
Employment rates in low-wage jobs recovered rapidly to {{emp_jul_q1}}%
below baseline levels by summer 2020, but then stalled from that point
onward.

*Heterogeneity Across Areas.* To identify the mechanisms driving
these employment impacts, we again exploit geographic variation, studying
whether employment fell most in the high-rent areas that faced the
largest demand shocks. Figure \ref{fig:employment_vs_rent} plots
changes in bottom-wage-quartile employment rates from January to July
2020 vs. median rents, by county. Consistent with the larger shocks
in high-rent areas to business revenue and labor demand for low-skilled
workers, low-wage employment rates fell much more in more affluent
counties. Low-wage employment fell by {{emp_jul_2020_rent_p100}}%
in the highest-rent counties, compared with {{emp_jul_2020_rent_p5}}%
in the lowest-rent counties. We find a similar pattern at the ZIP
code level using employment data from Earnin (Appendix Figure \ref{fig:employment_vs_rent_zip}).
Table \ref{tab:rent_association}, Panel B presents a set of regression
estimates quantifying these impacts. Low-wage employment rates fell
more in higher-rent areas (Column 1), even when controlling for the
density of high-wage workers (Column 2) and comparing ZIP codes within
the same county (Column 3).

The concentration of employment losses in more affluent areas is a
consequence of the specific pattern of demand shocks induced by COVID
rather than a general feature of recessions. Appendix Figure \ref{fig:employment_covid_vs_gfc}
shows that in the Great Recession (from 2007-2010), counties in the
bottom quartile of the household median income distribution accounted
for {{emp_loss_2008_q1}}% of job losses, while those in the
top quartile accounted for {{emp_loss_2008_q4}}% of job losses.
By contrast, in the COVID recession (from January to April 2020),
counties in the top quartile accounted for a larger share of job losses
than counties in the bottom quartile.

In summary, the pandemic led to a short “V-shaped” recession for
high-wage workers, but a prolonged reduction in employment for lower-wage
workers that persisted until at least December 2021, the end of our
analysis period. Geographic disaggregation reveals that the drop in
low-wage employment at the start of the pandemic was driven primarily
by a contraction in spending among high-income individuals -- which
then reduced labor demand for low-skilled workers -- rather than
voluntary reductions in labor supply (that might have been induced,
for example, by health concerns or unemployment benefits). In the
next section, we examine why employment rates remained low even as
the economy began to recover.

## Recovery

By the middle of 2021, aggregate consumer spending (Figure \ref{fig:spending_changes_by_income_quartile})
and small business revenues (Appendix Figure \ref{fig:smallbiz_national_trend})
had met pre-COVID baseline levels and continued to climb upward. Yet employment
rates in jobs that paid wages in the bottom quartile of the pre-pandemic
distribution remained {{emp_dec_q1}}% lower even as of December 2021 (Figure \ref{fig:employment_changes_by_income_quartile}). 
What explains this “jobless recovery” at the bottom of the wage distribution?

*Wage Growth.* Part of the explanation is real wage growth:
wage rates rose faster than the poverty line during the pandemic,
leading some workers to move up out of the bottom wage bin (rather
than into non-employment). To quantify the impact of wage growth,
we seek to measure how much wage rates changed in a given job. Lacking
panel data at the job level, we measure changes in wage rates within
detailed industry, occupation, and demographic cells using data from
the CPS (see Appendix \ref{subsec:Appx-Employment-WageGrowth} for
details). Using the estimated wage growth distribution, we estimate
that {{expl_pp}} percentage points of the reduction in the total
number of workers in the lowest wage group as of December 2021 is
due to wage growth, leaving {{notexpl_pp}}pp due to changes in
employment patterns -- either exits into non-employment or switches
to higher-paying jobs.

We assess the contribution of switching to higher-paying jobs using
two methods: assessing whether the cross-sectional composition of
employment has shifted toward higher-paying jobs and measuring employment
rates by pre-pandemic wage quartile in panel data. To implement the
first test, we measure the wage distribution based on pre-COVID (2019)
wage rates in each industry x occupation x race x gender x region
cell of the Current Population Survey. We find that shifts in the
job distribution across these cells actually led to an *increase*
in the share of workers in the lowest wage group as of December 2021.
To implement the second test, we use the CPS Outgoing Rotation Group
panel, consisting of individuals who responded to CPS survey interviews
spaced twelve months apart. In this panel, non-employment rates as
of July 2020 to February 2021 are {{q1_vs_q4_diff}}pp higher for
those who started out in the bottom wage quartile pre-COVID (between
July 2019 and February 2020) than for those who started out in the top
wage quartile.[^cps_panel_limitations] These findings indicate that exits to non-employment explain most
of the remaining reduction in bottom-quartile employment in the cross-sectional
data after accounting for wage growth.

[^cps_panel_limitations]: We cannot use this panel approach to examine employment beyond February
2021 conditional on pre-COVID wage rates because households responding
to the CPS answer the Outgoing Rotation Group panel questions exactly
twice, twelve months apart.

In the rest of this section, we analyze why low-wage workers remained
out of work at higher rates as of December 2021, distinguishing between
labor demand and supply channels.

*Labor Demand.* Although aggregate demand recovered, consumer
demand may have shifted across sectors and technologies in ways that
reduced labor demand for lower-skilled workers in the U.S. For example,
consumer demand shifted persistently over the course of the pandemic
toward larger companies, online vendors, and certain sectors such
as retail trade [@carman2020; @dunn2020measuring; @SecondMeasureAlexanderKarger]. 
Such companies might have more capital-intensive
production functions or outsource more of their production, leading
to a persistent downward shift in the demand for low-skilled labor
in the U.S. Furthermore, firms may have sought efficiencies in their
production processes and economic activity may have shifted to more
efficient firms during the recession, potentially further reducing
labor demand [@berger2012countercyclical; @Lazear2016; @jaimovich2020job].

To evaluate this demand-side explanation, we first examine the evolution
of aggregate job postings over time in Appendix Figure \ref{fig:jobposts_national_trend}.
Postings for jobs that required minimal or no skills had returned
to pre-COVID levels by mid-2020 and were well *above* pre-COVID
levels in most of 2021 as businesses sought to restaff after reducing
their payrolls earlier in the pandemic, consistent with the findings
of @FORSYTHE2022.[^labor_supply_shortage]

[^labor_supply_shortage]: The high level of job postings in the second half of 2021 may also
reflect a labor supply shortage, as companies had to post more jobs
to fill a given set of positions.

Furthermore, there is no evidence of mismatch in labor demand relative
to the supply of low-wage workers across sectors or places. Figure
\ref{fig:employment_changes_reweighted} plots employment for workers
in the bottom wage quartile, reweighting the series to match baseline
employment shares by county and industry (two-digit NAICS) in the top
wage quartile. This reweighting closes very little of the gap between
the two series, showing that differences in industry and location
do not explain the differences in employment trajectories between
low- and high-wage workers.[^retail_spending_by_wage] 
Similarly, we find no evidence of a spatial mismatch between job
posts and workers: reweighting job postings across counties by the
number of bottom-wage-quartile workers in January 2020 has little
impact on the time series of job postings (Appendix Figure \ref{fig:jobposts_national_trend}).
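
The logic of this reweighting can be sketched as follows. The example is a simplified illustration, not our production code: the function names, the dictionary layout keyed by (county, industry) cells, and the employment counts are all hypothetical, and cells without baseline low-wage employment are dropped by assumption.

```python
def aggregate_change(base, now):
    """Unweighted change in total employment relative to baseline."""
    return sum(now.values()) / sum(base.values()) - 1.0


def reweighted_change(base_low, now_low, base_high):
    """Change in low-wage employment, with cells weighted by high-wage baseline shares.

    Each argument maps a (county, industry) cell to an employment count.
    """
    # Assumption: keep only cells that have low-wage employment at baseline.
    cells = [c for c in base_high if base_low.get(c, 0) > 0]
    total_weight = sum(base_high[c] for c in cells)
    if total_weight <= 0:
        raise ValueError("no overlapping cells with positive weight")
    return (
        sum(base_high[c] / total_weight * now_low[c] / base_low[c] for c in cells)
        - 1.0
    )


# Hypothetical counts for two cells; not real data.
base_low = {("county_a", "retail"): 100, ("county_b", "finance"): 100}
now_low = {("county_a", "retail"): 80, ("county_b", "finance"): 60}
base_high = {("county_a", "retail"): 50, ("county_b", "finance"): 150}

raw = aggregate_change(base_low, now_low)
reweighted = reweighted_change(base_low, now_low, base_high)
```

If differences in industry and location accounted for the gap between low- and high-wage workers, the reweighted series would move substantially toward the high-wage series; a small change indicates that composition is not the main driver.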

[^retail_spending_by_wage]: Appendix Figure \ref{fig:employment_changes_retail} presents a specific
example of this result by plotting trends in employment and spending
in the retail trade sector. Total retail spending was {{spend_retail}}%
higher as of December 2021 relative to the pre-COVID baseline. Employment
of high-wage workers was {{emp_retail_q4}}% above baseline
levels, yet employment of low-wage workers was still down by {{emp_retail_q1}}%
in this sector -- as in the economy as a whole.

We conclude that low-wage workers appear to have had considerable
demand for their skills in their own counties, yet chose not to take
jobs that were available.

*Labor Supply.* Given these findings, we next examine mechanisms
that may have led to a reduction in labor supply among low-wage workers.
We begin by analyzing how the labor market recovery differed in high-
vs. low-rent counties, building on the analysis in the previous sections.[^min_wage_states]

[^min_wage_states]: We omit California, Massachusetts, and New York in this cross-sectional
analysis because they each raised their minimum wages during our sample,
leading to a discrete mechanical reduction in the number of bottom-wage-quartile
workers over the course of the pandemic (see Appendix \ref{subsec:Internal-Data-Processing}).

Figure \ref{fig:jobs_rent_dec2021} shows that job postings were approximately
20% above pre-COVID baseline levels in both high-rent and low-rent
counties in December 2021. The gradient in job postings with respect
to rent that emerged when the pandemic hit (Figure \ref{fig:jobs_loweduc_vs_rent})
disappeared entirely by December 2021. Yet low-wage employment rates
remained substantially lower in high-rent areas than low-rent areas
(Figure \ref{fig:emp_rent_dec2021}). In the lowest-rent counties
-- where the initial reduction in aggregate demand was smallest (as
measured by small business revenues and job postings) -- the total
number of workers with jobs in the bottom wage quartile as of December
2021 was {{emp_dec_rent_p5}}% lower than it was pre-COVID.
This {{emp_dec_rent_p5}}% reduction is roughly consistent
with what we would expect based on wage growth (as discussed above),
indicating that employment had roughly fully recovered in places where
the pandemic had minimal effects on aggregate demand initially. In
contrast, in the highest-rent counties, bottom-wage-quartile employment
was {{emp_dec_rent_p100}}% lower in December 2021 than it
was pre-COVID.

Panels C and D of Figure \ref{fig:slopes_over_time} characterize
the evolution of the job postings and employment gradients by county-level
rents by month from April 2020 to December 2021. They plot slopes
from regressions of job postings and low-wage employment rates on
rent across counties (weighted by population) by month. The job postings
gradient begins to flatten starting in January 2021 and disappears
completely by the last quarter of 2021. In stark contrast, the employment
gradient steepens over time and never recovers during the period we
study.[^wage_growth]
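
As a rough illustration of how such monthly gradients can be computed, the sketch below estimates a population-weighted least-squares slope of an outcome on rent separately for each month. The function name `weighted_slope`, the data layout, and the values are hypothetical, and the sketch omits standard errors and any additional controls.

```python
def weighted_slope(x, y, w):
    """Slope from a weighted least-squares regression of y on x (with intercept)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    var = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if var == 0:
        raise ValueError("x has no weighted variation")
    return cov / var


# Hypothetical county data: median rent (dollars) and population weights.
rents = [800.0, 1200.0, 1600.0]
populations = [1.0, 1.0, 2.0]

# Hypothetical outcome (percent change from baseline) by month and county.
outcome_by_month = {
    "2020-04": [-10.0, -20.0, -30.0],
    "2021-12": [20.0, 20.0, 20.0],
}

slopes = {
    month: weighted_slope(rents, values, populations)
    for month, values in outcome_by_month.items()
}
```

In this toy example the gradient is negative in April 2020 and zero in December 2021, mirroring the qualitative pattern of a gradient that flattens over time.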

[^wage_growth]: The differential changes in employment rates in low-wage jobs across
low vs. high-rent areas are not driven by differential changes in
wage growth rates or occupational switching. Using the approaches
described above at a national level (see Appendix \ref{subsec:Appx-Employment-WageGrowth}
for details), we find that wage growth rates are, if anything, lower
in high-rent states than low-rent states and that rates of switching
to higher-paying jobs are uncorrelated with state-level rents. Furthermore,
the CPS panel shows that employment for workers who started in the
bottom wage quartile pre-pandemic remained lower in high-rent states
in February 2021 (Appendix Figure \ref{fig:cps_panel_rent_gradient}).

The results in Figure \ref{fig:slopes_over_time} suggest that the
places that experienced larger demand shocks initially (namely more
affluent, high-rent areas) exhibit persistent declines in employment
even as of December 2021, despite the fact that labor demand had recovered
fully in those areas by that point. One potential explanation for
this hysteresis in employment rates is a change in preferences or
commitments that workers made when the pandemic hit that induced persistent
changes in labor supply. For example, low-wage workers may have moved
to smaller apartments or changed their living arrangements such that
they could afford to work less when the pandemic hit, and may have
decided that they preferred to retain these arrangements going forward
even when labor demand recovered. Another possibility is that low-wage
workers' human capital decayed and made it more difficult for them
to obtain available jobs.

In Table \ref{tab:low_wage_reduction}, we contextualize the magnitude
of the cross-sectional variation in low-wage employment rates by regressing
changes in bottom-wage-quartile employment rates from January 2020
to December 2021 on median rents by county and other covariates that
reflect contemporaneous economic conditions. Column 1 replicates the
specification in Figure \ref{fig:emp_rent_dec2021}, showing that
employment remains sharply depressed in higher-rent areas (where the
initial aggregate demand shock was more severe) relative to lower-rent
areas in December 2021. In Column 2, we include two variables that
measure contemporaneous health and economic conditions -- the average
COVID case rate from October to December 2021 (a measure of the risk
of COVID exposure) and the number of weeks of unemployment insurance
benefits that individuals were eligible for in their state as of December
2021 -- as well as a set of demographic controls. The inclusion of
these variables does not affect the relationship between median rents
and employment rates significantly.

These estimates imply that the ongoing risk of COVID can explain approximately
{{covid_expl_emp}} percentage points of the {{notexpl_pp}}pp
reduction in bottom-wage-quartile employment that is not due to wage
growth. Similarly, multiplying the coefficient on the UI benefits
variable by the mean number of the weeks of additional UI benefits
for which individuals were eligible in December 2021 implies that
UI benefit extensions account for less than 1pp of the reduction in
bottom-wage-quartile employment. This cross-sectional estimate based
on changes in UI policies across states over time is consistent with
the quasi-experimental elasticities of employment rates with respect
to UI benefit length estimated by @coombs2022, which also
finds that UI benefits appear to have had small effects on employment
rates during the pandemic.

Column 3 of Table \ref{tab:low_wage_reduction} presents a variant
of the specification in Column 2, replacing median rent with the change
in bottom-wage-quartile employment rates from January to July 2020
-- the immediate loss in low-wage employment after the shock --
as the key independent variable. We find a positive relationship,
showing that areas where employment fell more in the immediate aftermath
of the pandemic exhibited persistent declines in employment nearly
two years later. 

Column 4 replicates Column 2, replacing the dependent variable with
the change in employment rate for jobs that paid wages in the top
quartile of the pre-pandemic wage distribution. We find no relationship
between top-quartile employment rates and rents, consistent with the
rapid recovery of labor demand and employment for high-skilled workers.

Finally, we evaluate whether changes in the total number of *available*
workers (i.e., the total population of lower-skilled workers in high-rent
areas) -- rather than changes in labor supply for a given worker
-- can explain a significant portion of the shortfall in employment
rates. Although the number of immigrants to the United States fell
during the pandemic, CPS data show that trends in total low-wage employment
rates for immigrants and US citizens aged 16 or older were virtually
identical (Appendix Figure \ref{fig:cps_by_citizenship}). Internal
migration from high-rent to lower-rent areas within the United States
also does not explain a significant share of the larger reduction
in employment rates in high-rent areas (Appendix Figure \ref{fig:migration_workplace_rent}).
Demographic trends in aging over this short period are also too small
to explain the shortfall in employment: the working-age population
(aged 15-64) grew from {{working_age_pop_2020}} to {{working_age_pop_2022}}
million between January 2020 and 2022 [@OECD2023_pop]. Finally,
the share of individuals who moved to self-employment (and hence were
not available to be low-wage employees) is also too small to explain
the aggregate shortfall in bottom-wage-quartile employment: the self-employed
share of individuals aged 16 or older rose from {{pct_self_employed_jan2020}}%
in January 2020 to {{pct_self_employed_dec2021}}% in December
2021 [@BLS2023_uninc; @BLS2023_inc; @BLS2023_pop].

In sum, the persistent reduction in low-wage employment is not readily
explained by changes in labor demand, changes in contemporaneous incentives
to work such as UI benefits or ongoing health risks, or changes in the
total number of workers. Rather, the strongest predictors of the cross-sectional
variation in employment rates in December 2021 are variables that
predict the size of the initial shock to aggregate demand -- echoing
the findings of @yagan2019employment, who documents hysteresis
in the labor markets that were hit hardest in the Great Recession.

# Evaluation of Policy Responses to COVID-19 {#sec:Policy-evals}

In this section, we examine the scope for stabilization policies to
break the chain of events documented above: reductions in spending,
especially by high-income households, were associated with losses
in business revenue and low-wage employment. We begin
by evaluating the stimulus payments made to households during the
pandemic, illustrating how the public data we construct are useful
for real-time policy evaluation. We then briefly discuss other examples
of policy evaluations and conclude by assessing whether the combination
of policy responses was sufficient to stabilize economic activity.

## Stimulus Payments to Households {#subsec:Stimulus-Eval}

The federal government sent a total of $814.4 billion in stimulus
checks to households at three points during the pandemic: April 2020,
January 2021, and March 2021 [@irs_databook_2021]. Were
these stimulus payments successful in boosting consumer spending?

We estimate the causal effect of each of the three stimulus payments
on spending in the first month after receipt, focusing in particular
on heterogeneity across the income distribution. We focus on a one-month
horizon because prior work shows that most of the impact of
stimulus payments and tax refunds is concentrated within three months
of receipt [e.g., @sahm2010; @broda2014]. Moreover,
spending impacts in the first month are highly predictive of spending
impacts in the first three months across subgroups [@parker2019,
Table 3].[^long_term_impacts]

[^long_term_impacts]: Prior studies benefited from substantial variation in the timing of
payments, permitting identification over a longer period of time.
In contrast, the stimulus payments we study each largely arrived on
a single day, making it challenging to estimate impacts over longer
horizons without strong assumptions about counterfactual trends.

*April 2020.* The Coronavirus Aid, Relief, and Economic Security
(CARES) Act made direct payments to nearly 160 million people starting
in mid-April 2020. Individuals earning less than $75,000 received
a stimulus payment of $1,200; married couples earning less than $150,000
received a payment of $2,400; and households received an additional
$500 for each dependent they claimed.[^phase_out] IRS statistics show that {{stim1_apr15_share}}% of stimulus
payments made in April were direct-deposited on exactly April 15,
2020, while some households received payments on April 14 [@treasury_statements].

[^phase_out]: The payments were reduced at higher levels of income and phased out
entirely for households with incomes above $99,000 (for single filers
without children) or $198,000 (for married couples without children).

We evaluate the impacts of these stimulus payments on consumer spending
using a high-frequency difference-in-differences research design applied
to our card spending data, comparing daily spending before vs. after
April 15 in 2020 vs. spending on the same calendar date in 2019. To
reduce cyclical fluctuations, we residualize daily spending (indexed
to average levels in January 2019) with respect to day-of-week fixed
effects, which we estimate using data for 2019. We then adjust for
a linear pre-trend in spending (which we assume to be common across
all income quartiles) in order to capture aggregate shocks in spending
during the pandemic in 2020.[^linear_trend]
In order to capture high-frequency changes in spending, we do not
smooth the daily spending using a 7-day moving average, unlike in
preceding sections.
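
The two adjustments described above can be sketched in Python as follows. This is a simplified illustration under our own assumptions, not the code behind our estimates: the helper names are hypothetical, the data are synthetic, and day-of-week effects are measured as deviations of weekday means from the overall 2019 mean. In practice the same 2019-based effects would also be subtracted from the 2020 series.

```python
from collections import defaultdict
from datetime import date, timedelta


def day_of_week_effects(dates, values):
    """Weekday means expressed as deviations from the overall mean (0 = Monday)."""
    overall = sum(values) / len(values)
    by_day = defaultdict(list)
    for d, v in zip(dates, values):
        by_day[d.weekday()].append(v)
    return {dow: sum(vs) / len(vs) - overall for dow, vs in by_day.items()}


def residualize(dates, values, effects):
    """Subtract the estimated weekday effect from each daily value."""
    return [v - effects[d.weekday()] for d, v in zip(dates, values)]


def detrend(event_time, values):
    """Fit a line to pre-period days (event_time < 0) and subtract it from all days."""
    pre = [(t, v) for t, v in zip(event_time, values) if t < 0]
    t_bar = sum(t for t, _ in pre) / len(pre)
    v_bar = sum(v for _, v in pre) / len(pre)
    slope = sum((t - t_bar) * (v - v_bar) for t, v in pre) / sum(
        (t - t_bar) ** 2 for t, _ in pre
    )
    intercept = v_bar - slope * t_bar
    return [v - (intercept + slope * t) for t, v in zip(event_time, values)]


# Synthetic 2019 series: spending is 10 index points higher on weekends.
dates_2019 = [date(2019, 4, 1) + timedelta(days=i) for i in range(28)]
spend_2019 = [110.0 if d.weekday() >= 5 else 100.0 for d in dates_2019]
effects = day_of_week_effects(dates_2019, spend_2019)
flat_2019 = residualize(dates_2019, spend_2019, effects)

# Synthetic series with a pre-period trend of -2 per day and a jump at t = 0.
event_time = [-3, -2, -1, 0, 1, 2]
detrended = detrend(event_time, [6.0, 4.0, 2.0, 5.0, 3.0, 1.0])
```

After residualization the synthetic 2019 series is flat across weekdays, and after detrending the post-period values measure the deviation from the extrapolated pre-period trend.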

[^linear_trend]: We permit pre-trends because spending fell rapidly for all income
groups in the days immediately preceding the April 15 stimulus payments,
as shown in Figure \ref{fig:spending_changes_by_income_quartile}. 
We assume common trends to maximize precision, as we find no significant
differences in pre-trends in spending across income quartiles in the
25 days preceding the stimulus payments. The differential changes in spending
by income quartile discussed in Section \ref{sec:Impacts} emerged before that period,
immediately after the pandemic began. We also show that not adjusting
for pre-trends at all yields qualitatively similar conclusions in
Appendix Figure \ref{fig:stimulus_not_detrended}.

Figure \ref{fig:stimulus1_eventstudy_lowincome} plots the difference
in daily spending in 2020 vs. 2019 for households who live in ZIP
codes with median household income in the bottom quartile of the national
distribution (which we term “low-income” households for convenience).
Spending increases markedly following the arrival of payments, with
particularly high spending in the days when stimulus checks first
arrived. To quantify the magnitude of the (short-run) impact of the
stimulus on spending, in Appendix Table \ref{tab:policy_stimulus}
we estimate difference-in-differences models using OLS regressions
of daily spending by income quartile (residualized against a common
linear pre-trend) in the 25 days before and after April 15 on an indicator
for being pre vs. post-stimulus interacted with calendar year. To
capture the non-linear dynamics evident in the non-parametric event
study plots, we estimate separate treatment effects for the first
five days starting on April 15 and from the 6th day onwards; see Appendix
\ref{sec:Appx-Stimulus-Policy} for a more detailed description of
our methodology.
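
A stripped-down version of this comparison is sketched below. It is illustrative only: the function name `did_effects`, the synthetic spending series, and the use of simple differences in means are our assumptions for exposition. In a fully saturated specification, these differences in means coincide with the OLS point estimates, but the sketch does not compute standard errors.

```python
def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


def did_effects(event_time, spend_2020, spend_2019, early_days=5):
    """Difference-in-differences effects for the first days and for later days.

    event_time counts days relative to the payment date (0 = payment date).
    """
    # First difference: 2020 vs. 2019 on the same calendar date.
    gap = [a - b for a, b in zip(spend_2020, spend_2019)]
    pre = [g for t, g in zip(event_time, gap) if t < 0]
    early = [g for t, g in zip(event_time, gap) if 0 <= t < early_days]
    late = [g for t, g in zip(event_time, gap) if t >= early_days]
    # Second difference: post-period gap minus pre-period gap.
    baseline = mean(pre)
    return {
        "first_days": mean(early) - baseline,
        "later_days": mean(late) - baseline,
    }


# Synthetic indexed spending for 25 days before and after the payment date.
event_time = list(range(-25, 25))
spend_2019 = [100.0] * len(event_time)
spend_2020 = [70.0 if t < 0 else 95.0 if t < 5 else 80.0 for t in event_time]

effects = did_effects(event_time, spend_2020, spend_2019)
```

In this synthetic example, spending relative to 2019 rises by 25 index points in the first five days and by 10 index points thereafter, relative to the pre-period gap.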

Using this approach, we estimate that spending increased by {{effect_April_q1_point}}pp
(s.e. = {{effect_April_q1_se}}) for bottom-income-quartile
households in the first month following the stimulus payments. Accounting
for the fraction of households who actually received stimulus checks
in this group, this estimate translates to an increase in spending
of {{effect_April_q1_dollars}} during the first month after
receiving a $1,200 stimulus check (see Appendix \ref{sec:Appx-Stimulus-Policy}
for more details). The estimates remain stable when varying the window
used to estimate the treatment effect, with point estimates ranging
from {{robustness_April}}, as shown in Appendix Figure \ref{fig:robustness}.

Figure \ref{fig:stimulus1_eventstudy_highincome} repeats this analysis
for high-income households -- those who live in ZIP codes with median
household income in the top quartile of the distribution. Once again,
we see a clear increase in spending in the month after the stimulus
payments were made relative to the month before, although there is
no immediate spike in spending the day the checks were received, as
one might expect given that higher-income households are less likely
to be liquidity constrained at high frequencies. We estimate that
spending for top-income-quartile households increased by {{effect_April_q4_point}}pp.
This smaller percentage point impact is to be expected because higher-income
households received smaller stimulus payments both in absolute
terms and as a percentage of their total expenditure. Rescaling this
effect, we estimate that high-income households spent {{effect_April_q4_dollars}}
per $1,200 of stimulus payments received in the first month.

The first bar in each set of bars plotted in Figure \ref{fig:stimulus_effect_sizes}
presents estimates of the impact of the April 2020 stimulus payments
on spending over a one-month horizon for each of the four ZIP-income
quartiles. Across the income distribution, households spent a large
fraction of their April 2020 stimulus checks in the month immediately
after receipt, consistent with evidence from confidential data from
JPMorgan Chase account holders subsequently reported by @farrellJPMorgan2020.[^stimulus_sector]

[^stimulus_sector]: Disaggregating the spending data by sector, we find that most of the
additional spending from the April 2020 stimulus went to durable goods
rather than in-person services. The stimulus thus increased the overall
level of spending, but did not channel money back to the businesses
that lost the most revenue due to the COVID shock. These findings
provide evidence for the “broken Keynesian cross” mechanism established
in @guerrieri_macro_covid’s model, where funds are not
recirculated back to the sectors shut down by the pandemic, potentially
diminishing multiplier effects.

*January 2021.* The COVID-Related Tax Relief Act made payments
of $600 per person to most Americans available beginning on January
4, 2021. Eligibility criteria largely followed those for the earlier
round of stimulus, with single households eligible for the full stimulus
amount up to $75,000 in income ($150,000 for married households).
The stimulus amount fell at higher income levels, with childless households
with incomes up to $87,000 (or $174,000 if married filing jointly)
receiving a payment.

To evaluate whether our data could shed light on this policy's impact
sufficiently rapidly to inform the design of future stimulus payments,
we analyzed the effects of the stimulus payments on spending from
January 4 to 19, 2021 and released results publicly on January 26, 2021
[@chetty2021jan]. We use the same difference-in-differences
design we used to study the first stimulus, except that we use December
4 to 14, 2020 as the pre-period rather than the days immediately preceding
the stimulus payments because those days coincide with the Christmas
holiday period, when daily spending exhibits 10 times higher variance
across days (even when looking at changes across years) than during
the first half of December (Appendix Figure \ref{fig:variance_week}).[^pre_period_window] 
We also omit pre-trends here because of the gap created by omission
of the holiday period.

[^pre_period_window]: Using the same 25-day pre-period window as was used for the first
stimulus yields point estimates that are statistically indistinguishable
from those we present, but with much wider confidence intervals due
to the greater noise in the pre-period.

Figure \ref{fig:stimulus2_eventstudy} replicates the series in Figure
\ref{fig:stimulus1_eventstudy_lowincome} and Figure \ref{fig:stimulus1_eventstudy_highincome}
for the January 2021 stimulus, plotting indexed daily changes in spending
in 2021 vs. 2020 for bottom- and top-income-quartile households. Low-income
households increase spending significantly after the arrival of the
January stimulus payments. In contrast, high-income households do
not change their spending levels significantly after January 4, 2021
relative to December 2020. Using difference-in-differences models
analogous to those above, we estimate that low-income households increased
spending over the first month after receiving their stimulus checks
by {{effect_January_q1_point}}pp, while high-income households
increased spending by {{effect_January_q4_point}}pp, an estimate
that is not significantly different from zero. The middle bars shown
in Figure \ref{fig:stimulus_effect_sizes} rescale these estimated
impacts into dollars per $1,200 to facilitate comparisons across
stimulus rounds. While the marginal propensity to consume (MPC) out
of these stimulus payments in the first month fell significantly for
all income groups in January 2021 relative to April 2020, the drop
in the MPC for high-income households was especially large. We estimate
that low-income households spent {{effect_January_q1_dollars}}
per $1,200 of stimulus received, {{pct_diff_q1_apr_jan}}%
smaller than the {{effect_April_q1_dollars}} estimated in April 2020. 
High-income households' spending out of their stimulus
checks fell even more sharply -- from {{effect_April_q4_dollars}} per $1,200 received
in April 2020 to just {{effect_January_q4_dollars}} per $1,200
in January 2021, a reduction of {{pct_diff_q4_apr_jan}}%.[^permutation_test]
These heterogeneous impacts on spending across income groups are
aligned with results subsequently reported in May 2021 by Greig, Deadman, and Noel
[-@greig2021, Box 1, page 20] using confidential data from JPMorgan Chase.
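
The idea behind a permutation (placebo) test of this kind can be sketched as follows. This is a generic one-sided version, assuming that a set of placebo estimates has already been computed; how the placebos behind the appendix figures were constructed is not reproduced here, and the function name and numbers are hypothetical.

```python
def permutation_p_value(actual_estimate, placebo_estimates):
    """One-sided p-value: the share of estimates at least as low as the actual one.

    Adding 1 to the numerator and denominator treats the actual estimate as one
    of the draws, so the p-value can never be exactly zero.
    """
    if not placebo_estimates:
        raise ValueError("need at least one placebo estimate")
    at_least_as_low = sum(1 for p in placebo_estimates if p <= actual_estimate)
    return (at_least_as_low + 1) / (len(placebo_estimates) + 1)


# Hypothetical values: an actual change of -5.0 against nine placebo changes.
placebos = [-6.0, -1.0, 0.0, 2.0, 3.0, 1.0, -2.0, 4.0, 0.5]
print(permutation_p_value(-5.0, placebos))  # 0.2
```

With only nine placebos the smallest attainable p-value is 0.1, so reporting a threshold such as $p<0.005$ requires at least 200 placebo estimates under this convention.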

[^permutation_test]: Both of these estimates are significantly lower (with $p<0.005$)
than those from April 2020 based on a permutation test; see Appendix
Figures \ref{fig:permutations_q1} and \ref{fig:permutations_q4}
for the full distribution of placebo estimates.

In short, this analysis demonstrates that one can gauge the (short-term)
effects of stimulus payments with just two weeks of data after the
payments are made using what are now publicly available data -- enabling
a rapid feedback loop for subsequent policy changes. Indeed, based
on these estimates, we predicted that making further stimulus payments
to high-income households would have modest impacts on their spending,
suggesting that targeting the next round of stimulus towards lower-income
households would save substantial resources that could be used to
support other programs, with minimal impact on economic activity.

*March 2021.* After extensive debate about whether higher-income
households should continue to receive stimulus payments -- including
discussion of the evidence described above [@fortunearticle]
-- Congress passed the American Rescue Plan on March 11, 2021. The
final plan continued to pay the full stimulus amount of $1,400 to
households earning up to $150,000, but phased the payments out more
rapidly beyond that threshold than initially proposed, so that households
with incomes above $80,000 (for single filers without children) or
$160,000 (for married couples without children) received no stimulus.
These revisions reduced the total amount of stimulus payments made
to high-income households by approximately $17 billion relative to
the original proposal in January 2021 [@taxfoundation].

Did the March 2021 stimulus payments in fact have lower impacts on
spending of higher-income households, as predicted based on the January
2021 evidence? Figure \ref{fig:stimulus3_eventstudy} replicates the
preceding figures for the 25 days before and after the March 2021
checks were sent out; here, we use exactly the same estimator as in
the first stimulus, as there are no holiday-induced fluctuations in
the pre-period. Bottom-income-quartile households increased spending
considerably in the days following the March payments, while spending
for high-income households did not change significantly. The third
set of bars in Figure \ref{fig:stimulus_effect_sizes} rescales these
effects into dollar impacts per $1,200 of stimulus payment. The estimated
impacts are much more similar to those observed in January 2021 than
in April 2020, with positive impacts on spending for lower-income
households but near-zero impacts on spending for top-quartile households.

*Discussion.* Why did the marginal propensity to consume out
of cash windfalls fall sharply over the course of the pandemic, especially
for higher-income households? Studies of stimulus payments in prior
recessions find little heterogeneity in MPCs by income, but show that
households with higher liquid wealth balances exhibit lower MPCs [@johnson2006;
@broda2014; @jappelli2014]. In normal times, most households
even in the top income quartile tend to have relatively little *liquid*
wealth [@kaplan2014model], explaining why they exhibited high
MPCs out of windfalls in previous recessions. But during the pandemic,
households started to accumulate substantial liquid wealth because
their incomes remained relatively stable while their spending fell
sharply, as discussed above in Section \ref{sec:Impacts}. The national
savings rate (measured in NIPA Table 2.6) rose from 7.6% in 2019
to 18.5% on average in Q2-Q4 of 2020. Using confidential data from
JPMorgan Chase, @greig2021 further show that cash balances
in checking accounts rose substantially from January to December 2020,
with the largest increases (in dollars) among high-income households.
Given this rapid growth in liquid wealth, it is not surprising that
high-income households started to spend less of their stimulus payments
over time.[^stimulus_lit]

[^stimulus_lit]: Summarizing the literature on impacts of stimulus payments, @sahm2019
observes that “households with low liquid assets relative to their
income tend to spend more (and more quickly) out of additional income
than those households with ample liquidity.” In normal times, Sahm
observes that “targeting current low-income or low-wealth households
may not identify the households most likely to spend the stimulus,
which could include some wealthy households.” The link between income
and liquid wealth changed during the pandemic, making such targeting
more feasible.

This analysis illustrates the value of real-time estimation of policy
impacts rather than predictions based on historical estimates. Despite
being based on a consensus across a large set of studies, historical
predictions about the lack of heterogeneity in MPCs by income proved
to be inaccurate given the unusual impacts of the pandemic on spending
behavior across the income distribution.[^hindsight] 
The core challenge is that parameters such as MPCs are not invariant
to the economic and policy environment. By directly estimating such
parameters in real time using newly available data, one can make policy
decisions that respond transparently -- based on publicly available
information -- to current economic conditions.

[^hindsight]: With the benefit of hindsight, one may have been able to predict that
MPCs would begin to fall for high-income households as their liquid
savings rose, but it is difficult to gauge ex-ante which of the many
potential dimensions of heterogeneity and structural change warrant
attention.

## Impacts of Other Policies

The data we make publicly available can also be used to study a range
of other policies beyond stimulus payments. For illustration, we briefly
discuss four examples of policies that were implemented during the
COVID-19 crisis. The first two are based on analyses we conduct ourselves
(detailed in Appendix \ref{sec:Supplemental-Analyses}) and the latter two are analyses conducted
by other researchers using our data in combination with other data
sources.

*State-Ordered Shutdowns and Reopenings.* Many states enacted
stay-at-home orders and shutdowns of businesses in an effort to limit
the spread of COVID infection and later reopened their economies by
removing these restrictions. Using our card spending and payroll data,
we evaluate the impacts of these policies using event study designs
that compare trends in states that shut down and re-opened at different
dates. We find that state-ordered shutdowns and reopenings had modest
impacts on economic activity. Spending and employment remained well
below baseline levels even after reopenings, and trended similarly
in states that reopened earlier relative to comparable states that
reopened later (Appendix Figures \ref{fig:reopenings}-\ref{fig:closures}).
Spending and employment also fell well *before* state-level
shutdowns were implemented. These findings are consistent with work
by @goolsbee2021fear and @VillaBoasSearHashtag
using cell phone location data as well as @BartikRothsteinHomebase
using timesheet data on hours of work.

*Paycheck Protection Program.* The Paycheck Protection Program
(PPP) sought to reduce employment losses by providing forgivable loans
worth more than $800 billion in total to small businesses that maintained
sufficiently high employment (relative to pre-crisis levels). Using
our payroll data disaggregated by firm size, we evaluate the impacts
of the PPP on employment by comparing employment trends at firms with
fewer than 500 employees (which were eligible for PPP assistance)
with firms in the same sector that had more than 500 employees (which
were ineligible). We find that employment at eligible firms increased by only {{ppp_eligible_beta}} percentage
points after the PPP was enacted in April 2020 relative to larger
firms that were ineligible for PPP (Appendix Figure \ref{fig:ppp_employment}).
Our point estimates imply that the cost per job saved by the PPP was \${{ppp_cost_per_job_saved}} 
(\${{ppp_cost_per_job_saved_upperCI}} at the upper bound of the 95% confidence interval);
netting out potential UI payments to these potentially unemployed
workers reduces this number only slightly to \${{ppp_cost_per_job_saved_w_UI}} per job saved
(see Appendix \ref{subsec:Paycheck-Protection-Program} for details).
@autor2020evaluation and @autor2022800 reach
similar conclusions using the same research design with microdata
from ADP, another large payroll processor. @PaycheckProtectionGranjaZwick
use a different design, exploiting cross-sectional variation in PPP
takeup driven by bank composition, and reach similar conclusions,
partly drawing upon the data we make publicly available. Together,
all of these studies suggest the PPP had modest marginal impacts on
employment in the short run, likely because the vast majority of PPP
loans went to inframarginal firms that were not planning to lay off
many workers.[^ppp_longrun]

[^ppp_longrun]: This analysis focuses solely on short-run employment effects; it remains
possible that the PPP may have long-term benefits by reducing permanent
business closures, as emphasized by @hubbardppp.

*Unemployment Benefit Increases.* The Federal Pandemic Unemployment
Compensation (FPUC) program paid supplemental unemployment benefits
of up to $600 per week from March to September 2020. @casado2020effect
use county-level variation in wage replacement rates resulting from
differences in industrial composition to estimate the effect of FPUC
payments on aggregate spending. Using our publicly available spending
data combined with UI claims data from Illinois, they estimate that
a 1% increase in the replacement rate increased county-level spending
by 0.167%, which implies that each $1 of UI benefits increased aggregate
spending at the county level by $1.23. For comparison, our estimates
above imply that each $1 paid out in stimulus checks increased
household-level spending by {{mean_mpc}} dollars on average (i.e., an MPC of {{mean_mpc}}).
In a standard Keynesian model, an MPC of {{mean_mpc}} would imply an impact
on aggregate spending of $\frac{ {{mean_mpc}} }{ 1 - {{mean_mpc}} } = {{keynesian_impact}}$, an order of magnitude
smaller than the estimated impact of UI benefits. In the pandemic,
where some sectors were effectively shut down, theory suggests that
the multipliers would be even smaller than the standard Keynesian
benchmark [@guerrieri_macro_covid]. This comparison suggests
that UI benefits targeted to unemployed individuals were a more potent
tool to stimulate aggregate spending than stimulus payments to all
individuals, especially later in the pandemic as employed households
built up a large stock of savings.

*Eviction Moratoria.* Many state and local governments enacted
moratoria on tenant eviction during the pandemic to provide stable
housing for those who might have lost their jobs. These moratoria
were implemented at different times in different states and counties.
@an2021more exploit variation in the timing of such moratoria
to estimate their impacts on spending. Using our publicly available
data on consumer spending by category coupled with other sources,
they estimate that a one-week eviction moratorium is associated with
a 1% increase in spending on necessities such as food and groceries.
They conclude that eviction moratoria not only reduced housing instability
but also boosted spending on other goods and potentially provided
an aggregate stimulus as a result.

Methodologically, these examples illustrate that data from private
sector sources can be used to evaluate a wide variety of policies
rapidly because many policies have heterogeneous impacts across geographic
areas or other dimensions, such as firm size. Reassuringly, the findings
obtained from our public statistics match those obtained from studies
with access to the underlying microdata, demonstrating that public
statistics constructed from private sector data sources can support
many policy analyses. Taken together, these studies suggest that policies
targeted directly at households that suffered the largest income losses
-- such as those who became unemployed or faced eviction -- had
the largest impacts on spending and downstream economic activity in
the pandemic.

## Secondary Impacts on Spending {#subsec:Secondary-Spending-Impacts}

We conclude by stepping back from the effects of specific policies
and analyzing whether the combination of government policies -- those
analyzed above as well as other macroeconomic responses and changes
in the economy -- was adequate to stem the downward spiral in economic
activity set off by the initial reduction in consumer spending documented
in Section \ref{sec:Impacts}. Did the loss of jobs among low-wage
workers trigger a secondary reduction in their own spending levels
due to a lack of disposable income (rather than health concerns) --
potentially setting off further business revenue losses and employment
losses? Or was government intervention adequate to prevent such secondary
responses? We investigate secondary spending responses among low-income
individuals by returning to the geographic heterogeneity in the size
of initial consumer demand shocks by local rent levels, as in Section
\ref{sec:Impacts}. In particular, we compare how spending evolved
in low-income ZIP codes whose residents worked predominantly in high-rent
areas vs. those whose residents worked predominantly in low-rent areas.

Figure \ref{fig:employment_vs_workplacerent_apr2020} presents a binned
scatter plot of changes in low-wage employment from January to April
2020 by *home* (residential) ZIP code vs. average *workplace*
rent. We construct this figure by combining ZIP-code-level data on
employment rates of low-wage workers from Earnin that we make publicly
available with public data from the Census LEHD Origin-Destination
Employment Statistics (LODES) database, which provides information
on the matrix of residential ZIP by work ZIP for low-income workers
in the U.S. in 2017, to compute the average workplace median rent
level for each residential ZIP. Figure \ref{fig:employment_vs_workplacerent_apr2020}
shows that low-income individuals who were working in high-rent areas
pre-COVID were much less likely to be employed after the shock hit
in April 2020 -- consistent with our findings above.[^location_vs_sector]
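
The average workplace rent for a residential ZIP code can be thought of as a commuter-weighted mean of rents across the ZIP codes where its residents work. The sketch below illustrates that weighting under stated assumptions: the input layout (home ZIP, work ZIP, worker count), the choice to skip work ZIPs without rent data, and all names and values are hypothetical rather than taken from the LODES files or from the construction used for the figure.

```python
from collections import defaultdict


def average_workplace_rent(commute_flows, median_rent_by_zip):
    """Commuter-weighted mean of workplace median rent for each home ZIP.

    commute_flows: iterable of (home_zip, work_zip, workers) tuples.
    median_rent_by_zip: mapping from work ZIP to its median rent.
    """
    weighted_rent = defaultdict(float)
    total_workers = defaultdict(float)
    for home_zip, work_zip, workers in commute_flows:
        rent = median_rent_by_zip.get(work_zip)
        if rent is None:
            continue  # assumption: drop flows to work ZIPs with no rent data
        weighted_rent[home_zip] += workers * rent
        total_workers[home_zip] += workers
    return {
        home_zip: weighted_rent[home_zip] / total_workers[home_zip]
        for home_zip in total_workers
        if total_workers[home_zip] > 0
    }


# Hypothetical flows: 30 residents of ZIP "A" work in "X" and 10 work in "Y".
flows = [("A", "X", 30), ("A", "Y", 10), ("B", "Y", 5)]
rents = {"X": 2000.0, "Y": 1000.0}
print(average_workplace_rent(flows, rents))  # {'A': 1750.0, 'B': 1000.0}
```

Residential ZIPs can then be binned by this measure and compared on changes in employment or spending.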

[^location_vs_sector]: These results are driven by work location rather than sectoral differences
in employment across areas: in the Earnin microdata, we find similar
results even when comparing workers employed at the *same firm*
(e.g., a chain restaurant). People working in high-rent ZIP codes
in January 2020 remained less likely to have a job (anywhere) in April 2020
than their co-workers working in a different establishment of the
same firm in lower-rent ZIP codes.

Next, we analyze how these differential shocks to employment affected
spending patterns, taking a step toward mapping the flow of shocks
in the economy [@andersen2022]. Figure \ref{fig:spending_vs_workplacerent_apr2020}
replicates Figure \ref{fig:employment_vs_workplacerent_apr2020} using
spending changes on the y-axis, restricting to households living in
low-income ZIP codes.[^low_income_zips] Low-income individuals living in areas where people tend to work
in high-rent ZIP codes cut spending by {{apr_spend_workrent_q4}}%
on average from January to April 2020, compared with {{apr_spend_workrent_q1}}%
for those living in areas where people tend to work in low-rent ZIPs (Figure \ref{fig:spending_vs_workplacerent_apr2020}).
The relationship remains similar in magnitude but is less precisely estimated when we compare ZIP codes within the same county (Appendix Table \ref{tab:spending_rent}).

[^low_income_zips]: We restrict this figure to households living in low-income ZIPs because
we cannot disaggregate the Affinity data by individual-level income.
Since the employment data already represent only low-income workers,
we do not restrict to low-income ZIPs in the employment analysis above;
however, the patterns are very similar when restricting to low-income
ZIPs in the Earnin data.

Figure \ref{fig:spending_vs_workplacerent_apr2020} implies that low-income
households who lost their jobs at the start of the pandemic initially reduced
their own spending more sharply -- portending
the start of the downward spiral described above. However, while employment
losses for low-wage workers persisted over time, the reductions in
spending did not. Figure \ref{fig:spending_vs_workplacerent_oct2020}
shows that by October 2020, spending in low-income ZIP codes was slightly
higher than it was pre-COVID on average, and there was no longer any
relationship between workplace rents and spending levels among low-income
households despite the persistence of employment losses in higher-rent
areas (Figure \ref{fig:slopes_over_time_employment_rent}). Figure
\ref{fig:spending_rent_slope_evol} plots the evolution of spending
in bottom-income-quartile ZIP codes that rank in the top quartile
of median workplace rent. Despite the fact that these areas faced
the largest and most persistent employment losses, consumer spending
recovered very rapidly after falling sharply at the onset of the pandemic,
exceeding pre-COVID levels starting in July 2020.

In sum, although a sharp gradient of spending reduction with respect
to employment losses emerged early in the pandemic, it vanished within
a few months. Total spending in areas where many workers had lost
their jobs and remained out of work was on par with that in areas where
workers had lost less income -- indicating that the secondary spending
response that could have produced a further downward spiral was effectively
shut down shortly after the crisis began. Losses in earned income
likely did not translate to further spending reductions because the
fiscal response to the crisis (e.g., via extended unemployment benefits
and stimulus payments) actually *increased* the total
disposable income of low-income households [@blanchet2022real;
@ganong2022spending]. These results suggest that, as a whole,
macroeconomic policy responses were effective in limiting
secondary declines in consumer spending as workers lost their jobs
-- perhaps even going beyond what was necessary -- even if they
could not address the losses in employment that arose from the initial
shock to consumer spending driven by health concerns.

# Conclusion {#sec:Conclusion}

Transactional data held by private companies have great potential
for measuring economic activity, but to date have been accessible
only through contracts to work with confidential microdata. In this
paper, we have constructed a public database to measure economic activity
at a high-frequency, granular level using data from private companies.
By systematically cleaning, aggregating, and benchmarking the underlying
microdata, we construct series that can be released publicly without
disclosing sensitive information.

We use this new public database to analyze the economic impacts of
COVID-19, demonstrating two ways in which the data provide a new tool
for empirical macroeconomics. First, the data can be used to rapidly
diagnose the root factors driving an economic crisis by learning from
cross-sectional heterogeneity, since different places and subgroups
often face different shocks. In the case of COVID-19, we find that
a sharp reduction in spending by high-income individuals due to health
concerns led to losses of business revenues and persistent reductions
in low-wage employment in affluent areas. Second, the data permit
rapid, real-time policy evaluation -- as demonstrated by our analyses
showing the changing impacts of fiscal stimulus payments over the
course of the pandemic -- opening a path to fine-tuning policy responses
based on their observed impacts rather than relying solely on historical
estimates.

The benefit of constructing a public database to conduct such analyses 
rather than working directly with private firms' confidential data is that 
we centralize the fixed costs of cleaning the data for research purposes. 
This facilitates transparency and reproducibility, and enables researchers 
to readily access these data to conduct a much broader set of analyses.
For example, the data have been used by local policymakers
to inform local policy responses and forecast tax revenue impacts
(e.g., [Maine](https://www.maine.gov/dafs/economist/sites/maine.gov.dafs.economist/files/inline-files/CEFC_background\%20materials_062520.pdf),
[Missouri](https://showmestrong.mo.gov/dashboard/), [Kansas](https://supporttopeka.com/recovery/),
and [Texas](https://tea.texas.gov/sites/default/files/covid/overview_of_remote_instruction_guidance_for_sy_20-21.pdf)).
They have also been used by Congressional staff to design federal
policies, e.g., predicting the impacts and costs of policies targeted
based on business revenue losses [RESTART Act -@RESTARTact].
And they have been used by other researchers to analyze a broad range
of issues, from constructing price indices that account for changes
in consumption bundles [@cavallo2020inflation] to analyzing
the effects of political views on economic outcomes [@makridis2020cost].

While we have focused here on the short-run impacts of COVID-19, private
sector data can be useful in monitoring impacts of economic shocks
on long-term outcomes as well. As an illustration, Figure \ref{fig:education_by_income_quartile}
plots weekly student engagement on Zearn, an online math platform
used by nearly one million elementary school students as part of their
regular school curriculum (see Appendix \ref{sec:Data-Zearn}). Children
in high-income areas learned less when the COVID crisis hit and schools
shifted to remote instruction, but soon recovered to baseline levels.
By contrast, children in lower-income areas completed {{zearn_q1_may20}}%
fewer lessons than they did pre-pandemic through the end of the school
year. These findings -- first established in May 2020, and confirmed
by subsequent work such as @goldhaber2022 and @jack2022
-- raise the concern that the pandemic may have long-lasting impacts
on low-income families not just through persistent reductions in employment
documented above but also through impacts on the next generation.

Over the 20th century, the Bureau of Economic Analysis built on a
prototype developed by @kuznets1941 to institute surveys
of businesses and households that form the basis for today's National
Income and Product Accounts. The database built here provides a prototype
for a system of more granular, real-time national accounts built using
transactional private sector data. The fact that even this first prototype
yields insights that cannot be obtained from existing data suggests
that aggregating data from private companies to construct
public statistics has great potential for improving our understanding
of economic activity and policymaking.

<!----------------------------------------------------------------------------->
<!-- References (leave blank, will be automatically filled) ------------------->
<!----------------------------------------------------------------------------->
\pagebreak
\pdfbookmark[0]{References}{references}
\section*{References}

::: {#refs}
:::

\clearpage